Abstract

In this paper, we develop a quantified propositional proof system
that corresponds to logarithmic-space reasoning. We begin by defining a
class $\Sigma CNF(2)$ of quantified formulas that can be evaluated in log space.
Then our new proof system GL* is defined as $G_1^*$ with cuts restricted to
$\Sigma CNF(2)$ formulas, where no cut formula that is not quantifier-free
contains a free variable that does not appear in the final formula.

To show that GL* is strong enough to capture log space reason- 
ing, we translate theorems of VL into a family of tautologies that have 
polynomial-size GL* proofs. VL is a theory of bounded arithmetic that 
is known to correspond to logarithmic-space reasoning. To do the trans- 
lation, we find an appropriate axiomatization of VL, and put VL proofs 
into a new normal form. 

To show that GL* is not too strong, we prove the soundness of GL* 
in such a way that it can be formalized in VL. This is done by giving a 
logarithmic-space algorithm that witnesses GL* proofs. 

1 Introduction 

Recently there has been a significant amount of research looking into the 
connection between computational complexity, bounded arithmetic, and 
propositional proof complexity. A recent survey on this topic can be found 
at [6]. The idea is that there is a hierarchy of complexity classes 

\[ AC^0 \subsetneq TC^0 \subseteq NC^1 \subseteq L \subseteq NL \subseteq P. \]

The first class is the set of problems that can be solved by uniform,
polynomial-size, constant-depth circuits. This class is important because
it can be shown that PARITY cannot be solved in $AC^0$. In fact, problems
that involve counting cannot be solved in $AC^0$. The second class is $TC^0$.
This set of problems is the same as $AC^0$ except that $TC^0$ circuits can
use counting gates. The class $NC^1$ is the set of problems that can be
solved using polynomial-size, logarithmic-depth circuits. This class can
be thought of as the set of problems that can be solved very quickly when
work is done in parallel. Evaluating boolean formulas is complete for this
class. The class L is the set of problems that can be solved in logarithmic
space on a Turing machine. The class NL is the set of problems that can
be solved in logarithmic space on a non-deterministic Turing machine.
The reachability problem for directed graphs is complete for this class.
The sequence finishes with P, the set of problems that can be solved in
polynomial time on a deterministic Turing machine. Except for the first
inclusion, it is unknown if any of these inclusions are proper.

Each of these complexity classes has a corresponding theory of arithmetic:
$V^0$, $VTC^0$, $VNC^1$, VL, VNL, and $TV^0$, respectively. Each of these
theories can prove that the functions in their corresponding complexity
class are total. As a consequence, any information we can obtain about
the theory tells us something about the complexity class and vice versa.

There is also a connection with propositional proof complexity. Some
of the theories mentioned above have a corresponding propositional proof 
system. As before, information about the proof systems tells us about 
the corresponding theory and complexity class. In this paper, we explore 
the proof systems. The goal is to try to understand how the strength of 
a proof system is affected by different restrictions. 

Our focus will be on quantified propositional proof systems, but, to ex- 
plain our method, we will use quantifier-free propositional proof systems. 
Start with a Frege proof system, sometimes called a Hilbert-style system.
These systems are described in standard logic textbooks. A Frege proof
is a series of propositional formulas where each formula is an axiom or
can be inferred from previous formulas using one of the rules of inference.
There are two common ways of restricting this proof system. The first is
to restrict all of the formulas in the proof. For example, one definition of
bounded-depth Frege is to restrict every formula in the proof to formulas
with a constant depth. This works, but, if a proof system is defined
this way, then there are formulas that cannot be proved simply because
they are not allowed to appear in the proof. For example, bounded-depth
Frege with formulas of depth d cannot prove any formula of depth d + 1.
The other method is to restrict the formulas on which certain rules can
be applied. This solves the problem of the first method and led to other
definitions of bounded-depth Frege.

In this paper, we will look at restricting the cut rule in the tree-like
sequent calculus for quantified propositional formulas. This system is
known as $G^*$. The cut rule derives $\Gamma \to \Delta$ from $A, \Gamma \to \Delta$ and
$\Gamma \to \Delta, A$. In $G^*$, $A$ can be any quantified propositional formula. The proof
system $G_0^*$ is defined by restricting $A$ to quantifier-free formulas. If we are
given a $G_0^*$ proof of a formula $\exists z B(z)$, where $B$ is quantifier-free, then
we can find a witness for the existential quantifiers in this formula in uniform
$NC^1$; moreover, this problem is complete for this class. The complexity
class $NC^1$ is the set of problems that can be solved by polynomial-size,
logarithmic-depth circuits with fan-in 2. The interesting observation is
that evaluating quantifier-free formulas is also complete for $NC^1$. It is also
possible to connect $G_0^*$ to $NC^1$ indirectly through bounded arithmetic.
There is a theory of arithmetic $VNC^1$ that is known to correspond to
$NC^1$ reasoning. Given a $VNC^1$ proof of a bounded formula, it is possible
to translate the proof into a family of polynomial-size $G_0^*$ proofs. This
tells us that the reasoning power of $G_0^*$ is at least as strong as that of
$VNC^1$ [5]. In the other direction, $VNC^1$ can prove that $G_0^*$ is sound
when proving $\Sigma_1^q$ formulas. This means that, when proving $\Sigma_1^q$ formulas,
the reasoning power of $G_0^*$ is not stronger than that of $VNC^1$. So we say
that $G_0^*$ corresponds to $NC^1$ reasoning.

As well, if we restrict cut formulas to constant-depth, quantifier-free
formulas, we get a proof system that corresponds to $AC^0$ reasoning.
The complexity class $AC^0$ is the set of problems that can be solved by
polynomial-size, constant-depth circuits with unbounded fan-in. Again,
evaluating constant-depth formulas is complete for $AC^0$. We should note
that we are talking about the proofs of quantifier-free formulas.

This gives us two proof systems whose reasoning power is the same as
the complexity of evaluating their cut formulas. This raises the question
of whether or not this holds in general. The quick answer is no. A
counter-example is $G_1^*$. Evaluating $\Sigma_1^q$ formulas is complete for NP, but the
witnessing problem for $G_1^*$ is complete for P [8]. Another counter-example
is GPV*, where cut formulas are quantifier-free or formulas of the form
$\exists x [x \leftrightarrow A]$, where $A$ is a quantifier-free formula that does not mention
$x$. Evaluating a cut formula for GPV* is complete for $NC^1$, but the
witnessing problem is complete for P [14].

In this paper, we define a new proof system GL* that corresponds to 
L reasoning. The complexity class L is the set of problems that can be 
solved on a Turing Machine with a read-only input tape and a work tape 
where the space used on the work tape is proportional to the logarithm 
of the size of the input. Our proof system GL* is defined by restricting 
cuts to $\Sigma CNF(2)$ formulas, a set of formulas for which the evaluation
problem is complete for L. However, that is not enough. We also restrict 
the free variables that appear in cut formulas with quantifiers to variables 
that appear free in the final sequent. We then prove this proof system 
corresponds to L reasoning by connecting it with a theory of arithmetic 
that is known to correspond to L reasoning. This definition is meant to 
demonstrate that the strength of a proof system is not related to the diffi- 
culty of evaluating a single cut formula in the proof, but to the complexity 
of witnessing the eigenvariables in the proof. 

In Section 2, we give definitions of the important concepts. In partic- 
ular, we define two-sorted computational complexity and bounded arith- 
metic. As well, we define the standard proof systems and explain the 
connection between proof systems and theories of bounded arithmetic in 
more detail. In Section 3, we define GL*. This includes the definition of 
the $\Sigma CNF(2)$ formulas. In Section 4, we change the theory VL and prove
a normal form that is necessary for our results. This is the most technical
section in the paper. In Section 5, we prove the translation theorem. In
Section 6, we prove that GL* is sound in the theory. This includes an
algorithm to evaluate $\Sigma CNF(2)$ formulas in L.

This paper is an expanded version of the author's earlier paper [13]. 

2 Basic Definitions And Notation 
2.1 Two-Sorted Computational Complexity 

In this paper, we use two-sorted computational complexity. The two sorts 
are numbers and binary strings (also viewed as finite sets). The numbers are
intended to range over the natural numbers and will be denoted by lower-case
letters. For example, i, j, x, y, and z will often be used for number
variables; r, s, and t will be used for number terms; and f, g, and h will
be used for functions that return numbers. The strings are intended to
be finite strings over {0, 1} with leading zeros removed. Since the strings are
finite, they can be thought of as sets where the ith bit is 1 if i is in the
set. The strings will be denoted by upper-case letters. The letters X, Y,
and Z will often be used for string variables.

We focus on the complexity class L. Let $R(x, X)$ be a relation. If we
are going to solve this relation on a Turing Machine M, then the input to
M will be $x$ in unary and $X$ as a series of binary strings. So the size of
the input is $x + |X|$. We say $R$ is in L if it can be decided by a two-tape
Turing Machine such that one tape is a read-only input tape, and at most
$O(\log(x + |X|))$ squares are visited on the other tape.

For functions, we say a number function $f(x, X)$ is in FL if there is a
polynomial $p$ such that $f(x, X) < p(x, |X|)$, and the relation $f(x, X) = y$
is in L. A string function $F(x, X)$ is in FL if the size of $F(x, X)$ is bounded
by a polynomial and if the relation

\[ R(i, x, X) \leftrightarrow \text{the $i$th bit of } F(x, X) \text{ is } 1 \]

is in L. This is equivalent to defining FL using a three-tape Turing 
Machine with a write-only output tape. 

2.2 Two-Sorted Bounded Arithmetic 

Besides two-sorted computational complexity, we also use two-sorted
bounded arithmetic. The sorts are the same. This notation is based on
the work of Zambella in [15], but we follow the presentation of Cook and
Nguyen from [4, 6].
The base language is 

\[ \mathcal{L}_A^2 = \{0, 1, +, \times, <, =, =_2, \in, |\ |\}. \]

The constants 0 and 1 are number constants. The functions $+$ and $\times$
take two numbers as input and return a number; the intended meanings
are the obvious ones. The language also includes two binary predicates
that take two numbers: $<$ and $=$. The predicate $=_2$ is meant to be
equality between strings, instead of numbers. In practice, the 2 will not
be written because which equality is meant is obvious from the context.
The membership predicate $\in$ takes a number $i$ and a string $X$. It is
meant to be true if the $i$th bit of $X$ is 1 (or $i$ is in the set $X$). This will
also be written as $X(i)$. The final function $|X|$ takes a string as input
and returns a number. It is intended to be the number of bits needed to
write $X$ when leading zeros are removed (or the least upper bound of the
set $X$). The set of axioms 2BASIC is the set of defining axioms for $\mathcal{L}_A^2$.
Among these axioms are:

\[ X(y) \supset y < |X| \]
\[ y + 1 = |X| \supset X(y) \]
\[ \text{SE.}\quad X = Y \leftrightarrow [\, |X| = |Y| \wedge \forall i < |X| \, (X(i) \leftrightarrow Y(i)) \,] \]

We use $\exists X \le b\, \varphi$ as shorthand for $\exists X [(|X| \le b) \wedge \varphi]$. The shorthand
$\forall X \le b\, \varphi$ means $\forall X [(|X| \le b) \supset \varphi]$. The set $\Sigma_0^B = \Pi_0^B$ is the set
of formulas whose only quantifiers are bounded number quantifiers. For
$i > 0$, the set $\Sigma_i^B$ is the set of formulas of the form $\exists X \le t\, \varphi$ where $\varphi$ is
a $\Pi_{i-1}^B$ formula. For $i > 0$, the set $\Pi_i^B$ is the set of formulas of the form
$\forall X \le t\, \varphi$ where $\varphi$ is a $\Sigma_{i-1}^B$ formula.

Now we can define two important axiom schemes: 

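\[ \Phi\text{-COMP:}\quad \exists X \le b\, \forall i < b\, [X(i) \leftrightarrow \varphi(i)], \]

\[ \Phi\text{-IND:}\quad [\varphi(0) \wedge \forall x < b\, [\varphi(x) \supset \varphi(x + 1)]] \supset \varphi(b), \]

where $\Phi$ is a set of formulas and $\varphi \in \Phi$, and, for $\Phi$-COMP, $\varphi$ does not
contain $X$, but may contain other free variables.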

We can now define the base theory. 

Definition 2.1. The theory $V^0$ is axiomatized by the 2BASIC axioms
plus $\Sigma_0^B$-COMP.

It is possible to show that $V^0$ proves $\Sigma_0^B$-IND (a corollary in [6]). This
theory is typically viewed as the theory that corresponds to $AC^0$
reasoning.

From time to time, we will use function symbols that are not in $\mathcal{L}_A^2$.
The first is $X(i, j) \equiv X(\langle i, j \rangle)$, where $\langle i, j \rangle = (i + j)(i + j + 1) + 2j$ is
the pairing function. It can be thought of as a two-dimensional array of
bits. The second is the row function. The notation we use is $X^{[i]}$. This
function returns the $i$th row of the two-dimensional array $X$. In the same
way, we can also describe three-dimensional arrays. We also want to pair
strings. So if $X = \langle Y_1, Y_2 \rangle$, then $X^{[0]} = Y_1$ and $X^{[1]} = Y_2$. Note that,
if we add these functions with their $\Sigma_0^B$ defining axioms to any theory $T$
extending $V^0$, we get a conservative extension. They can also be used
in the induction axioms [4]. This means that, if there is a $T$ proof of a
formula that uses these functions, there is a $T$ proof of the same formula
that does not use these functions.
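The array coding described above can be illustrated with a small sketch. It assumes strings are modelled as Python sets of natural numbers and that the pairing function is $(i + j)(i + j + 1) + 2j$; the names `pair`, `entry`, and `row`, and the explicit `width` argument, are hypothetical helpers introduced only for this illustration.

```python
def pair(i, j):
    """Pairing function <i, j> = (i + j)(i + j + 1) + 2j."""
    return (i + j) * (i + j + 1) + 2 * j


def entry(X, i, j):
    """X(i, j): bit (i, j) of the two-dimensional array coded by the set X."""
    return pair(i, j) in X


def row(X, i, width):
    """Row i of X, as the set of columns j < width with X(i, j) true."""
    return {j for j in range(width) if entry(X, i, j)}


# A 2 x 3 array with ones at (0, 1), (1, 0) and (1, 2).
X = {pair(0, 1), pair(1, 0), pair(1, 2)}
```

Because sums $i + j$ of different size land in disjoint ranges, distinct pairs receive distinct codes, which is what lets a single string stand for a whole array.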

To get a theory that corresponds to L reasoning, we add an axiom that
says there is an output to a function that is complete for L with respect
to $AC^0$ reductions. This is a specific example of the method used in [4]
to construct a theory for a given complexity class. The theory we define
is $\Sigma_0^B$-rec from [16], but we will call it VL. The complete function we use
is: given a graph with edge relation $\varphi(i, j)$ and nodes $\{0, \dots, a\}$, where
every vertex in the graph has out-degree at least 1, find a path of length
$b$. This is expressed using the $\Sigma_0^B$-rec axiom:

\[ \forall x \le a\, \exists y \le a\, \varphi(x, y) \supset \exists Z\, \forall w < b\, \varphi(f(a, w, Z), f(a, w + 1, Z)) \tag{$\Sigma_0^B$-rec} \]

where $f(a, w, Z) = \min_x (Z(w, x) \vee x = a)$ and $\varphi$ is a $\Sigma_0^B$ formula. The
idea is that the function $f(a, w, Z)$ extracts the $w$th node in the path that
$Z$ encodes.
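To make the roles of $Z$ and $f$ concrete, here is a minimal sketch of a witness for the axiom. It assumes the edge relation is given as a Python predicate `phi`, that every node in $\{0, \dots, a\}$ has a successor, and that $Z$ is modelled as a set of pairs `(w, x)` rather than as a coded string. The names `find_path` and `encode`, and the choice to start at node 0, are assumptions of this illustration.

```python
def find_path(phi, a, b):
    """Return nodes v_0, ..., v_b with phi(v_w, v_{w+1}) for every w < b.

    Only the current node is kept between steps, which is the reason the
    search fits in logarithmic space.
    """
    node = 0
    path = [node]
    for _ in range(b):
        node = next(y for y in range(a + 1) if phi(node, y))
        path.append(node)
    return path


def encode(path):
    """Model Z as the set of pairs (w, x) where x is the w-th node."""
    return {(w, x) for w, x in enumerate(path)}


def f(a, w, Z):
    """f(a, w, Z) = least x such that Z(w, x) holds or x = a."""
    return min(x for x in range(a + 1) if (w, x) in Z or x == a)
```

With `phi = lambda x, y: y == (x + 1) % 4`, `a = 3`, and `b = 6`, the set `encode(find_path(phi, a, b))` satisfies `phi(f(a, w, Z), f(a, w + 1, Z))` for every `w < b`.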

Definition 2.2. The theory VL is the theory axiomatized by $V^0$ plus
$\Sigma_0^B$-rec.

The $\Sigma_0^B$-rec axiom has the disadvantage that the path can start at any
node. However, as Zambella pointed out in [16], it is possible to prove
that there is a path of length $b$ starting at a particular node $a$.

Lemma 2.3. Let $E$ be the edge relation for a directed graph on the nodes
$\{0, \dots, n - 1\}$. Then for all $a < n$ and $b$, VL proves that, if
$\forall i < n\, \exists j < n\, E(i, j)$, then there is a path of length $b$ starting at node $a$.

Proof. Define $\varphi'(\langle w, i \rangle, \langle w', j \rangle)$ as

\[ \varphi'(\langle w, i \rangle, \langle w', j \rangle) \equiv (w' = w + 1 \bmod b + 1) \wedge (w' \ne 0 \supset E(i, j)) \wedge (w' = 0 \supset j = a). \]

Take a path of length $2b$ in the graph of $\varphi'$. At some point in the first half
of that path, the path passes through the node $\langle 0, a \rangle$. Starting from there
we can extract a path of length $b$ in $E$ that starts at node $a$. □
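The construction in the proof can be sketched as follows. This is an illustration under stated assumptions, not the formal argument: nodes of the layered graph are modelled as tuples `(w, i)`, the helper names `phi_prime` and `path_from` are hypothetical, and the walk takes $2b + 1$ steps (one more than in the text) so that an arbitrary start node in layer 0 is also covered.

```python
def phi_prime(E, a, b):
    """Edge relation of the layered graph used in the proof of Lemma 2.3."""
    def edge(u, v):
        (w, i), (w2, j) = u, v
        return (w2 == (w + 1) % (b + 1)
                and (w2 == 0 or E(i, j))    # w' != 0 implies E(i, j)
                and (w2 != 0 or j == a))    # w' = 0 implies j = a
    return edge


def path_from(E, n, a, b, start=(0, 0)):
    """Return b + 1 nodes forming a path in E that starts at node a."""
    edge = phi_prime(E, a, b)
    nodes = [(w, i) for w in range(b + 1) for i in range(n)]
    current = start
    walk = [current]
    for _ in range(2 * b + 1):
        # Every node has a successor: layers wrap around to (0, a).
        current = next(v for v in nodes if edge(current, v))
        walk.append(current)
    k = walk.index((0, a))
    return [i for (_, i) in walk[k:k + b + 1]]
```

Whenever the layer counter wraps to 0 the walk is forced to the node $a$, so the segment that follows is an honest $E$-path from $a$.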

2.3 A Universal Theory For L Reasoning 

Another way to get a theory for L is to define a universal theory with a
language that contains a function symbol for every function in FL. Then,
we get a theory for L by taking the defining axioms for these functions.
This is the idea behind other universal theories like PV and $\overline{V^0}$. In our
case, we characterize the FL functions using Lind's characterization [10]
adjusted for the two-sorted setting.

In the next definition, we define the set of function symbols in $\mathcal{L}_{FL}$
and give their intended meaning.

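Definition 2.4. The language $\mathcal{L}_{FL}$ is the smallest language satisfying

1. $\mathcal{L}_A^2 \cup \{pd, \min\}$ is a subset of $\mathcal{L}_{FL}$ and has defining axioms 2BASIC,
and the axioms

\[ pd(0) = 0 \tag{2.1} \]
\[ pd(x + 1) = x \tag{2.2} \]
\[ \min(x, y) = z \leftrightarrow (z = x \wedge x \le y) \vee (z = y \wedge y \le x) \tag{2.3} \]

2. For every open formula $\alpha(i, \vec{x}, \vec{X})$ over $\mathcal{L}_{FL}$ and term $t(\vec{x}, \vec{X})$ over
$\mathcal{L}_A^2$, there is a string function $F_{\alpha,t}$ in $\mathcal{L}_{FL}$ with bit defining axiom

\[ F_{\alpha,t}(\vec{x}, \vec{X})(i) \leftrightarrow i < t(\vec{x}, \vec{X}) \wedge \alpha(i, \vec{x}, \vec{X}) \tag{2.4} \]

3. For every open formula $\alpha(z, \vec{x}, \vec{X})$ over $\mathcal{L}_{FL}$ and term $t(\vec{x}, \vec{X})$ over
$\mathcal{L}_A^2$, there is a number function $f_{\alpha,t}$ in $\mathcal{L}_{FL}$ with defining axioms

\[ f_{\alpha,t}(\vec{x}, \vec{X}) \le t(\vec{x}, \vec{X}) \tag{2.5} \]
\[ z < t(\vec{x}, \vec{X}) \wedge \alpha(z, \vec{x}, \vec{X}) \supset \alpha(f_{\alpha,t}(\vec{x}, \vec{X}), \vec{x}, \vec{X}) \tag{2.6} \]
\[ z < f_{\alpha,t}(\vec{x}, \vec{X}) \supset \neg\alpha(z, \vec{x}, \vec{X}) \tag{2.7} \]

4. For all number functions $g(\vec{x}, \vec{X})$ and $h(p, y, \vec{x}, \vec{X})$ in $\mathcal{L}_{FL}$ and term
$t(y, \vec{x}, \vec{X})$ over $\mathcal{L}_A^2$, there is a number function $f_{g,h,t}(y, \vec{x}, \vec{X})$ with
defining axioms

\[ f_{g,h,t}(0, \vec{x}, \vec{X}) = \min(g(\vec{x}, \vec{X}), t(\vec{x}, \vec{X})) \tag{2.8} \]
\[ f_{g,h,t}(y + 1, \vec{x}, \vec{X}) = \min(h(f_{g,h,t}(y, \vec{x}, \vec{X}), y, \vec{x}, \vec{X}), t(\vec{x}, \vec{X})) \tag{2.9} \]

The last scheme is called p-bounded number recursion. The p-bounded
number recursion is equivalent to the log-bounded string recursion given
in [10]. The other schemes come from the definition of $\mathcal{L}_{FAC^0}$ in [4].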

It is not difficult to see that every function in $\mathcal{L}_{FL}$ is in FL. The only
point we should note is that the intermediate values in the recursion are
bounded by a polynomial in the size of the input. This means that, if we store
intermediate values in binary, the space used is logarithmic in the
size of the input. So the recursion can be simulated in L. To show that
every FL function has a corresponding function symbol in $\mathcal{L}_{FL}$, note that
the p-bounded number recursion can be used to traverse a graph where
every node has out-degree at most one.
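The space argument can be seen in a small sketch of the recursion scheme. It assumes the functions $g$, $h$, and the bounding term $t$ are supplied as Python callables and that the parameters are bundled into a single argument `x`; the name `bounded_recursion` is hypothetical.

```python
def bounded_recursion(g, h, t, y, x):
    """Compute f(y, x) where
        f(0, x)     = min(g(x), t(x))
        f(y + 1, x) = min(h(f(y, x), y, x), t(x)).

    Only one intermediate value is stored at a time, and it never
    exceeds t(x), so its binary representation stays short.
    """
    value = min(g(x), t(x))
    for step in range(y):
        value = min(h(value, step, x), t(x))
    return value


# Traversing a graph in which every node has exactly one successor:
# successors[p] is the node reached from p.
successors = (1, 2, 3, 3)
end = bounded_recursion(
    g=lambda s: 0,              # start at node 0
    h=lambda p, y, s: s[p],     # follow the unique outgoing edge
    t=lambda s: len(s),         # values are bounded by the number of nodes
    y=4,
    x=successors,
)
```

Here `end` is the node reached after four steps from node 0.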

Definition 2.5. $\overline{VL}$ is the theory over the language $\mathcal{L}_{FL}$ with B1-B11,
SE, plus 2.1; 2.2; 2.3; axiom 2.4 for each string function $F_{\alpha,t}$ in $\mathcal{L}_{FL}$;
axioms 2.5, 2.6, and 2.7 for each number function $f_{\alpha,t}$ in $\mathcal{L}_{FL}$; and axioms
2.8 and 2.9 for each number function $f_{g,h,t}$ in $\mathcal{L}_{FL}$.

An open($\mathcal{L}$) formula is a formula over the language $\mathcal{L}$ that does not
have any quantifiers.

The important part of this theory is that it really is a universal version
of VL.

Theorem 2.6. $\overline{VL}$ is a conservative extension of VL.

Proof. First we prove that $\overline{VL}$ is an extension of VL. All that is required
is to prove the $\Sigma_0^B$-COMP and $\Sigma_0^B$-rec axioms. To prove $\Sigma_0^B$-COMP, note
that every $\Sigma_0^B$ formula $\varphi$ is equivalent to an open formula $\varphi'$. For example,

\[ \overline{VL} \vdash \exists z < b\, \psi(z, \vec{x}, \vec{X}) \leftrightarrow \psi(f_{\psi,b}(\vec{x}, \vec{X}), \vec{x}, \vec{X}) \]

when $\psi$ is an open formula. Then the function $F_{\varphi',t}$ is the witness for

\[ \exists Z \le t\, \forall i < t\, [Z(i) \leftrightarrow \varphi(i)]. \]

To prove the $\Sigma_0^B$-rec axiom, we can define a function $f(i, a, E)$ that
returns the $i$th node in the path the axiom says exists. The function $f$ can
be defined using p-bounded number recursion. From there, a function
witnessing the $\Sigma_0^B$-rec axiom can be defined.

To prove that the extension is conservative, we show how to take any
model $M$ of VL and find an expansion that is a model of $\overline{VL}$. The idea is
to expand the model one function at a time. We can order the functions in
$\mathcal{L}_{FL}$ such that each function is defined in terms of the previous functions.
Let $\mathcal{L}_i$ be the language $\mathcal{L}_A^2$ plus the first $i$ functions in $\mathcal{L}_{FL}$. Let $M_i$ be
the model obtained by expanding $M$ to the functions in $\mathcal{L}_i$. One then shows
that the model $M_\infty = \bigcup_i M_i$ is a model of $\overline{VL}$. A similar proof can be found
in Chapter 9 of [6] and we will not repeat it here.

□

2.4 Quantified Propositional Calculus

We are also interested in quantified propositional proof systems. The
proof systems we use were originally defined in [9], and then they were
redefined in [5, 11], which is the presentation we follow.

The set of connectives is $\{\wedge, \vee, \neg, \exists, \forall, \top, \bot\}$, where $\top$ and $\bot$ are
constants for true and false, respectively. Formulas are built using these
connectives in the usual way. We will often refer to formulas by the
number of quantifier alternations.

Definition 2.7. The set of formulas $\Sigma_0^q = \Pi_0^q$ is the set of quantifier-free
propositional formulas. For $i > 0$, the set of $\Sigma_i^q$ ($\Pi_i^q$) formulas is the
smallest set of formulas that contains $\Pi_{i-1}^q$ ($\Sigma_{i-1}^q$) and is closed under
$\wedge$, $\vee$, existential (universal) quantification, and if $A \in \Pi_i^q$ ($A \in \Sigma_i^q$) then
$\neg A \in \Sigma_i^q$ ($\neg A \in \Pi_i^q$).

The first proof system, from which all others will be defined, is the 
proof system G. This proof system is a sequent calculus based on Gentzen's 
system LK. The system G is essentially the DAG-like, propositional ver- 
sion of LK. We will not give all of the rules, but will mention a few of 
special interest. 

The cut rule is

\[ \frac{A, \Gamma \to \Delta \qquad \Gamma \to \Delta, A}{\Gamma \to \Delta} \]

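In this rule, we call $A$ the cut formula. There are also four rules that
introduce quantifiers:

\[ \exists\text{-left}\ \frac{A(x), \Gamma \to \Delta}{\exists z A(z), \Gamma \to \Delta} \qquad \exists\text{-right}\ \frac{\Gamma \to \Delta, A(B)}{\Gamma \to \Delta, \exists z A(z)} \]

\[ \forall\text{-left}\ \frac{A(B), \Gamma \to \Delta}{\forall z A(z), \Gamma \to \Delta} \qquad \forall\text{-right}\ \frac{\Gamma \to \Delta, A(x)}{\Gamma \to \Delta, \forall z A(z)} \]

These rules have conditions on them. In $\exists$-left and $\forall$-right, the variable
$x$ must not appear in the bottom sequent. In these rules, $x$ is called the
eigenvariable. In the other two rules, the formula $B$ must be a $\Sigma_0^q$ formula,
and no variable that appears free in $B$ can be bound in $A(x)$.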

The initial sequents of $G$ are sequents of the form $\to \top$, $\bot \to$, or
$x \to x$, where $x$ is any propositional variable. A $G$ proof is a series
of sequents such that each sequent is either an initial sequent or can be
derived from previous sequents using one of the rules of inference. The
proof system $G_i$ is $G$ with cut formulas restricted to $\Sigma_i^q$ formulas.

We define $G^*$ as the treelike version of $G$. So, a $G^*$ proof is a $G$ proof
where each sequent is used as an upper sequent in an inference at most
once. A $G_i^*$ proof is a $G^*$ proof in which cut formulas are prenex $\Sigma_i^q$. In
[11], it was shown that, for treelike proofs, it did not matter if the cut
formulas in $G_i^*$ were prenex or not. So when we construct $G_1^*$ proofs, the
cut formulas will not always be prenex, but that does not matter.

To make proofs simpler, we assume that all treelike proofs are in free- 
variable normal form. 






Definition 2.8. A parameter variable for a $G^*$ proof $\pi$ is a variable that
appears free in the final sequent of $\pi$. A proof $\pi$ is in free-variable normal
form if (1) every non-parameter variable is used as an eigenvariable exactly
once in $\pi$, and (2) parameter variables are not used as eigenvariables.

Note that, if a proof is treelike, we can always put it in free-variable
normal form by simply renaming variables. In fact, VPV proves that
every treelike proof can be put in free-variable normal form.

A useful property of these proof systems is the subformula property. It
can be shown in VL that every formula in a $G_i^*$ proof is an ancestor (and
therefore a subformula) of a cut formula or a formula in the final sequent.
This is useful because it tells us that any non-$\Sigma_i^q$ formula in a $G_i^*$ proof
must be an ancestor of a final formula.

2.5 Truth Definitions 

In order to reason about the proof systems in the theories, we must be 
able to reason about quantified propositional formulas. We follow the 
presentation in [8, 9, 5]. 

Formally formulas will be coded as strings, but we will not distinguish 
between a formula and its encoding. So if F is a formula, we will use F as 
the string encoding the formula as well. The method of coding a formula 
can be found in [5]. 

In this paper, we are only interested in $\Sigma_0^q$ formulas and prenex $\Sigma_1^q$
formulas. For $\Sigma_0^q$ formulas, we are able to give a $\Sigma_0^B(\mathcal{L}_{FL})$ formula
that evaluates the formula. This formula will be referred to using $A \models_0 F$,
where $A$ is an assignment and $F$ is a formula. We leave the precise
definition to the reader.

Given a prenex $\Sigma_1^q$ formula $F$, the truth definition is a formula that
says there is an assignment to the quantified variables that satisfies the
$\Sigma_0^q$ part of the formula. This formula will be referred to as $A \models_1 F$.

Valid formulas (or tautologies) are defined as

\[ TAUT_i(F) \equiv \forall A,\ (\text{``$A$ is an assignment to the variables of $F$''} \supset A \models_i F) \]

This truth definition can be extended to define the truth of a sequent.
So, if $\Gamma \to \Delta$ is a sequent of $\Sigma_i^q \cup \Pi_i^q$ formulas, then

\[ (A \models_i \Gamma \to \Delta) \equiv \text{``there exists a formula in $\Gamma$ that $A$ does not satisfy''} \vee \text{``there exists a formula in $\Delta$ that $A$ satisfies''} \]

Another important formula we will use is the reflection principle for a
proof system. We define the $\Sigma_i^q$ reflection principle for a proof system $P$
as

\[ \Sigma_i^q\text{-RFN}(P) \equiv \forall F\, \forall \pi,\ (\text{``$\pi$ is a $P$ proof of $F$''} \wedge F \in \Sigma_i^q) \supset TAUT_i(F) \]

This formula essentially says that, if there exists a $P$ proof of a $\Sigma_i^q$ formula
$F$, then $F$ is valid. Another way of putting it is to say that $P$ is sound
when proving $\Sigma_i^q$ formulas.
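The truth definitions can be pinned down with a brute-force sketch. It assumes formulas are represented as nested Python tuples and assignments as dictionaries; the representation and the names `evaluate`, `satisfies_sigma1`, and `is_tautology` are hypothetical. The exhaustive search only fixes the intended semantics and is not meant to reflect the complexity of the formal definitions.

```python
from itertools import product


def evaluate(F, A):
    """A |=_0 F for a quantifier-free formula F given as nested tuples."""
    op = F[0]
    if op == "var":
        return A[F[1]]
    if op == "const":
        return F[1]
    if op == "not":
        return not evaluate(F[1], A)
    if op == "and":
        return evaluate(F[1], A) and evaluate(F[2], A)
    if op == "or":
        return evaluate(F[1], A) or evaluate(F[2], A)
    raise ValueError(f"unknown connective: {op}")


def satisfies_sigma1(zs, F, A):
    """A |=_1 (exists zs) F: some values for zs make the matrix F true."""
    return any(
        evaluate(F, {**A, **dict(zip(zs, values))})
        for values in product([False, True], repeat=len(zs))
    )


def is_tautology(free_vars, zs, F):
    """TAUT_1 of (exists zs) F: every assignment to free_vars satisfies it."""
    return all(
        satisfies_sigma1(zs, F, dict(zip(free_vars, values)))
        for values in product([False, True], repeat=len(free_vars))
    )


# Matrix of (exists z)[(z or not x) and (not z or x)], i.e. exists z [z <-> x].
z, x = ("var", "z"), ("var", "x")
iff_matrix = ("and", ("or", z, ("not", x)), ("or", ("not", z), x))
```

For instance, `is_tautology(["x"], ["z"], iff_matrix)` holds, while the same call on the matrix `("and", z, x)` does not.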
2.6 Propositional Translations

There is a close connection between the theory $V^i$ and the proof system
$G_i^*$. You can think of $G_i^*$ as the non-uniform version of $V^i$. This idea might
not make much sense at first until you realize you can translate a $V^i$ proof
into a polynomial-size family of $G_i^*$ proofs. The translation that we use
is described in [4, 5]. It is a modification of the Paris-Wilkie translation
[12]. Given a $\Sigma_i^B$ formula $\varphi(\vec{x}, \vec{X})$ over the language $\mathcal{L}_A^2$, we want to
translate it into a family of propositional formulas $\|\varphi(\vec{x}, \vec{X})\|[\vec{m}; \vec{n}]$, where
the size of the formulas is bounded by a polynomial in $\vec{m}$ and $\vec{n}$. The
formula $\|\varphi(\vec{x}, \vec{X})\|[\vec{m}; \vec{n}]$ is meant to be a formula that is a tautology when
$\varphi(\vec{x}, \vec{X})$ is true in the standard model whenever $x_i = m_i$ and $|X_i| = n_i$.
Then if $\varphi(\vec{x}, \vec{X})$ is true in the standard model for all $\vec{x}$ and $\vec{X}$, then every
$\|\varphi(\vec{x}, \vec{X})\|[\vec{m}; \vec{n}]$ is a tautology.

The variables $\vec{m}$ and $\vec{n}$ will often be omitted since they are understood.
The free variables in the propositional formula will be $p_j^{X_i}$ for $j < n_i - 1$.
The variable $p_j^{X_i}$ is meant to represent the value of the $j$th bit of $X_i$; we
know that bit $n_i - 1$ is 1, and for $j > n_i - 1$, we know the $j$th bit is 0. The
definition of the translation proceeds by structural induction on $\varphi$.

Suppose $\varphi$ is an atomic formula. Then it has one of the following
forms: $s = t$, $s < t$, $X_i(t)$, or one of the trivial formulas $\bot$ and $\top$, for
terms $s$ and $t$. Note that the terms $s$ and $t$ can be evaluated immediately.
This is because the exact value of every number variable and the size of
each string variable is known. Let $val(t)$ be the value of the term $t$.

In the first case, we define $\|s = t\|$ as the formula $\top$, if $val(s) = val(t)$,
and $\bot$, otherwise. A similar construction is done for $s < t$. If $\varphi$ is one
of the trivial formulas, then $\|\varphi\|$ is the same trivial formula. So now, if
$\varphi \equiv X_i(t)$, let $j = val(t)$. Then the translation is defined as follows:

\[ \|X_i(t)\| \equiv \begin{cases} p_j^{X_i} & \text{if } j < n_i - 1 \\ \top & \text{if } j = n_i - 1 \\ \bot & \text{if } j > n_i - 1 \end{cases} \]

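Now for the inductive part of the definition. Suppose $\varphi \equiv \alpha \wedge \beta$. Then

\[ \|\varphi\| \equiv \|\alpha\| \wedge \|\beta\|. \]

When the connective is $\vee$ or $\neg$, the definition is similar. If the outermost
connective is a number quantifier bound by a term $t$, let $j = val(t)$. Then
the translation is defined as

\[ \|\exists y \le t,\ \alpha(y)\| \equiv \bigvee_{i=0}^{j} \|\alpha(y)\|[i] \]

\[ \|\forall y \le t,\ \alpha(y)\| \equiv \bigwedge_{i=0}^{j} \|\alpha(y)\|[i] \]

If the outermost connective is a string quantifier, the translation is

\[ \|\exists Y \le t,\ \alpha(Y)\| \equiv \exists p_0^Y, \dots, \exists p_{j-2}^Y,\ \bigvee_{i=0}^{j} \|\alpha(Y)\|[i] \]

\[ \|\forall Y \le t,\ \alpha(Y)\| \equiv \forall p_0^Y, \dots, \forall p_{j-2}^Y,\ \bigwedge_{i=0}^{j} \|\alpha(Y)\|[i] \]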
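The number-sorted part of the translation can be sketched as a short recursive function. This is an illustration under several assumptions: formulas are nested tuples, terms are Python callables evaluated against the known number values `env`, string lengths are given by the dictionary `n`, the bound of a number quantifier is taken to be inclusive, and string quantifiers are omitted. The name `translate` and the tuple encoding are hypothetical.

```python
from functools import reduce


def translate(phi, env, n):
    """Return ||phi|| as nested tuples, for known number values and lengths."""
    op = phi[0]
    if op in ("eq", "lt"):
        # Terms have known values, so the atom becomes a constant.
        s, t = phi[1](env), phi[2](env)
        return ("const", s == t if op == "eq" else s < t)
    if op == "bit":
        # X(t): a variable below the top bit, true at the top bit, else false.
        X, j = phi[1], phi[2](env)
        if j < n[X] - 1:
            return ("var", X, j)
        return ("const", j == n[X] - 1)
    if op == "not":
        return ("not", translate(phi[1], env, n))
    if op in ("and", "or"):
        return (op, translate(phi[1], env, n), translate(phi[2], env, n))
    if op in ("exists", "forall"):
        # Bounded number quantifier: expand into a finite disjunction
        # or conjunction over the values 0..val(t).
        y, bound, body = phi[1], phi[2](env), phi[3]
        parts = [translate(body, {**env, y: i}, n) for i in range(bound + 1)]
        joiner = "or" if op == "exists" else "and"
        return reduce(lambda left, right: (joiner, left, right), parts)
    raise ValueError(f"unknown connective: {op}")


# "forall y <= 2, X(y)" translated for a string X of length 3.
all_bits = ("forall", "y", lambda e: 2, ("bit", "X", lambda e: e["y"]))
result = translate(all_bits, {}, {"X": 3})
```

In `result`, bits 0 and 1 of `X` appear as propositional variables and the top bit appears as the constant true, matching the case distinction for $X_i(t)$.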
Now we are able to state the translation theorem for $V^i$ and $G_i^*$.

Theorem 2.9. Suppose $V^i \vdash \varphi(\vec{x}, \vec{X})$, where $\varphi$ is a bounded formula.
Then there are polynomial-size $G_i^*$ proofs of the family of tautologies
$\|\varphi(\vec{x}, \vec{X})\|[\vec{m}; \vec{n}]$.

This type of theorem is the standard way of proving that the reasoning
power of the proof system is at least as strong as that of the theory.

3 Definition of GL* 

In this section, we will define the proof system we wish to explore. As was
stated in the introduction, this proof system is defined by restricting cut
formulas to a set of formulas that can be evaluated in L. Alone, that is
not enough to change the strength of the proof system, so we also restrict
the use of eigenvariables.

The first step is to define a set of formulas that can be evaluated in
L. These formulas will be based on $CNF(2)$ formulas. A $CNF(2)$ formula
is a CNF formula where no variable has more than two occurrences in
the entire formula. It was shown in [7] that determining whether or not a
given $CNF(2)$ formula is satisfiable is complete for L. Based on this we
get the following definition:

Definition 3.1. The set of formulas $\Sigma CNF(2)$ is the smallest set

1. containing $\Sigma_0^q$,

2. containing every formula $\exists \vec{z}\, \varphi(\vec{z}, \vec{x})$ where (1) $\varphi$ is a quantifier-free
CNF formula $\bigwedge_{i=1}^{m} C_i$ and (2) existence of a $z$-literal $l$ in $C_i$ and $C_j$,
$i \ne j$, implies existence of an $x$-variable $x$ such that $x \in C_i$ and
$\neg x \in C_j$ or vice versa, and

3. closed under substitution of $\Sigma_0^q$ formulas that contain only $x$-variables
for $x$-variables.
The idea behind this definition is that any assignment
to the variables $\vec{x}$ reduces the quantifier-free portion to a $CNF(2)$ formula
in $\vec{z}$.

Definition 3.2. GL* is the propositional proof system $G_1^*$ with cuts restricted to
$\Sigma CNF(2)$ formulas in which every free variable in a non-$\Sigma_0^q$ cut formula
is a parameter variable.
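The syntactic condition (2) in Definition 3.1 can be checked mechanically. The sketch below assumes a clause is a set of literals `(variable, polarity)` and that the quantified variables are listed in `z_vars`; the function names are hypothetical, and closure under substitution (item 3) is not modelled.

```python
from itertools import combinations


def conflict_on_x(c1, c2, z_vars):
    """True if some x-variable occurs with opposite polarity in c1 and c2."""
    return any(
        var not in z_vars and (var, not positive) in c2
        for (var, positive) in c1
    )


def is_sigma_cnf2_matrix(clauses, z_vars):
    """Check condition (2): two clauses that share a z-literal must
    disagree on some x-variable, so no x-assignment leaves both alive."""
    for c1, c2 in combinations(clauses, 2):
        shared = {lit for lit in c1 if lit[0] in z_vars} & set(c2)
        if shared and not conflict_on_x(c1, c2, z_vars):
            return False
    return True


# (z or x) and (z or not x): the shared literal z is guarded by x.
good = [{("z", True), ("x", True)}, {("z", True), ("x", False)}]
# (z or x1) and (z or x2): both clauses can survive the same assignment.
bad = [{("z", True), ("x1", True)}, {("z", True), ("x2", True)}]
```

On these inputs the check accepts `good` and rejects `bad`, which mirrors the remark that every assignment to the $x$-variables must leave each $z$-literal in at most one clause.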

The restriction on the free variables in the cut formula might seem
strange, but it is necessary. If we did not have this restriction, then the
proof system would be as strong as $G_1^*$. We will not give a full proof of
this, but the interested reader can see information on GPV* in [14]. What
we will show is that, if the restriction on the variables is not present, then
the proof system can simulate $G_1^*$ for $\Sigma_1^q$ formulas.

Let H* be the proof system $G_1^*$ with cuts restricted to $\Sigma CNF(2)$
formulas and no restriction on the free variables.

Definition 3.3. An extension cedent $\Lambda$ is a sequence of formulas

\[ \Lambda \equiv y_1 \leftrightarrow B_1,\ y_2 \leftrightarrow B_2,\ \dots,\ y_n \leftrightarrow B_n \tag{3.1} \]

where $B_i$ is a $\Sigma_0^q$ formula that does not mention any of the variables
$y_i, \dots, y_n$. We call the variables $y_1, \dots, y_n$ extension variables.

Based on a lemma in [8], Cook and Nguyen proved the following lemma
in [6].

Lemma 3.4. If $\pi$ is a $G_1^*$ proof of $\exists \vec{z} A(\vec{z}, \vec{x})$, where $A$ is a $\Sigma_0^q$ formula,
then there exists a PK proof $\pi'$ of

\[ \Lambda \to A(\vec{y}, \vec{x}) \]

where $\Lambda$ is as in 3.1 and $|\pi'| \le p(|\pi|)$, for some polynomial $p$.

The proof guaranteed by this lemma is also an H* proof since every
PK proof is also an H* proof. Extending this proof with a number of
applications of $\exists$-right, we get an H* proof of

\[ \Lambda \to \exists \vec{z} A(\vec{z}, \vec{x}). \tag{3.2} \]

So now we need to find a way to remove the extension cedent $\Lambda$. This
is done one formula at a time. Suppose $y \leftrightarrow B$ is the last formula in $\Lambda$.
The key observation is that $\exists y [y \leftrightarrow B]$ is a $\Sigma CNF(2)$ formula because
the formula can be expressed as $\exists y [(y \vee \neg B) \wedge (\neg y \vee B)]$. So we can apply
$\exists$-left with $y$ as the eigenvariable to (3.2). The eigenvariable restriction
is met because $y$ is the last extension variable, and, therefore, cannot appear
anywhere else in the extension cedent. Then we cut $\exists y [y \leftrightarrow B]$ after deriving
$\to \exists y [y \leftrightarrow B]$. We can then do this for every formula in $\Lambda$ starting at
the end. This proves the following theorem.

Theorem 3.5. H* p-simulates $G_1^*$ for $\Sigma_1^q$ formulas.

This proof is not always a GL* proof because the extension variables
are not parameter variables, yet they appear in cut formulas.

4 Adjusting VL 

In order to prove the translation theorem, we start with the theory VL,
which corresponds to L reasoning. This theory was defined in Section 2.2.
The proof of the translation theorem is similar to other proofs of its type.
We take an anchored (or free-cut free) proof. Then the cut formulas in
this proof will translate into the cut formulas in the propositional proof.
If we use VL for this, there are two problems: (1) not all of the axioms of
VL translate into $\Sigma CNF(2)$ formulas and (2) the restriction on the free
variables in cut formulas may not be met. In the first subsection, we take
care of the first problem. The second problem is taken care of in Section
4.2.

4.1 A New Axiomatization For VL 

We want to reformulate the axioms of VL so they translate into $\Sigma CNF(2)$
formulas. All of the 2BASIC axioms are $\Sigma_0^B$, so they translate into $\Sigma_0^q$
formulas, which are $\Sigma CNF(2)$, so they do not create any problems. We
only need to consider $\Sigma_0^B$-COMP and $\Sigma_0^B$-rec. We handle $\Sigma_0^B$-COMP the
same way Cook and Morioka did in [5]. That is, if the proof system is asked
to cut the translation of an instance of the $\Sigma_0^B$-COMP axiom, then the
propositional proof is changed so that the cut becomes
$\bigwedge_{i=0}^{t} [\|\phi(i)\| \leftrightarrow \|\phi(i)\|]$,
which is $\Sigma CNF(2)$. To take care of $\Sigma_0^B$-rec, we define a new
theory that is equivalent to VL by replacing the $\Sigma_0^B$-rec axiom.

Informally, the new axiom says that there exists a string $Z$ that gives
a specific pseudo-path of length $b$ in the graph with $a$ nodes and edge
relation $\phi(i,j)$. This path starts at node 0. If $(i,j)$ is an edge in this
path, then $j$ is the smallest number with an edge from $i$ to $j$, or $j = a$
when there are no outgoing edges. Note that the edge may not exist in
the original graph when $j = a$. This is why we call it a pseudo-path. If
$(i,j)$ is the $w$th edge in the path, then $Z(w,i,j)$ is true, and $Z(w,i',j')$
is false for every other pair. This is described by the $\Sigma_0^B$-edge-rec axiom
scheme:

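$\exists Z \le 1 + \langle b, a, a\rangle\, [\psi_1 \land \psi_2 \land \psi_3 \land \psi_4 \land \psi_5 \land \psi_6 \land \psi_7 \land \psi_8]$, ($\Sigma_0^B$-edge-rec)
where

$\psi_1 \equiv \forall j < a,\ \neg Z(0,0,j) \lor \phi(0,j) \lor \exists l < j\, \phi(0,l)$

$\psi_2 \equiv \forall j \le a\, \forall k < j,\ \neg Z(0,0,j) \lor \neg\phi(0,k) \lor \exists l < k\, \phi(0,l)$

$\psi_3 \equiv \forall i \le a\, \forall j \le a,\ i = 0 \lor \neg Z(0,i,j)$

$\psi_4 \equiv \forall w < b\, \forall i \le a\, \forall j \le a,\ \neg Z(w+1,i,j) \lor \exists h \le a\, Z(w,h,i) \lor \neg\phi(i,j) \lor \exists l < j\, \phi(i,l)$

$\psi_5 \equiv \forall w < b\, \forall i \le a\, \forall j < a,\ \neg Z(w+1,i,j) \lor \phi(i,j) \lor \exists l < j\, \phi(i,l)$

$\psi_6 \equiv \forall w < b\, \forall i \le a\, \forall j \le a\, \forall k < j,\ \neg Z(w+1,i,j) \lor \neg\phi(i,k) \lor \exists l < k\, \phi(i,l)$

$\psi_7 \equiv \exists i \le a\, \exists j \le a,\ Z(b,i,j)$

$\psi_8 \equiv \forall \langle w,i,j\rangle \le \langle b,a,a\rangle,\ [w > b \lor i > a \lor j > a] \supset \neg Z(w,i,j)$

and $\phi(i,j)$ is a formula that does not mention $Z$, but may have other
free variables. It is not immediately obvious that the axiom says what it
is supposed to, so we will take a closer look.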

Let $Z$ be a string that witnesses the axiom. We want to make sure
$Z$ is the path described above. Looking at $\psi_3$, we see the path starts
at 0. Suppose $Z(0,0,j)$ is true. We must show that $j$ is the first node
adjacent to 0. This follows from $\psi_1$, which guarantees $\phi(0,j)$ is true when
$j < a$, and $\psi_2$, which guarantees $\phi(0,k)$ is false when $k < j$. A similar
argument can be made with $\psi_5$ and $\psi_6$ to show that every node is the
smallest node adjacent to its predecessor. To make sure the path is long
enough, we have $\psi_7$, which says there is a $b$th edge, and $\psi_4$, which says if
there is a $(w+1)$th edge there is a $w$th. As you may have noticed, there
are parts of this formula that semantically are not needed. For example,
the $\exists l < j\, \phi(0,l)$ in $\psi_1$ is not needed. It is used to make sure the axiom
translates into a $\Sigma CNF(2)$ formula. We add $\psi_8$ to make sure there is a
unique $Z$ that witnesses this axiom.
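As an informal illustration of what the axiom asserts, the witnessing pseudo-path can be computed directly. The following Python sketch is not part of the formal development: the names `pseudo_path` and `phi` are hypothetical, `phi` is assumed to be a Boolean function defined on all pairs of numbers up to $a$, and the behaviour once the path reaches the sink node $a$ simply follows whatever `phi` returns there.

```python
from typing import Callable, List, Tuple


def pseudo_path(phi: Callable[[int, int], bool], a: int, b: int) -> List[Tuple[int, int, int]]:
    """Return the triples (w, i, j), for w = 0..b, on which Z(w, i, j) would hold.

    phi is an assumed edge relation; a is the number of nodes and also the
    value used for "no outgoing edge"; b is the index of the last edge.
    """
    edges = []
    i = 0  # the path starts at node 0
    for w in range(b + 1):
        # j is the smallest k < a with an edge from i to k, or a if none exists
        j = next((k for k in range(a) if phi(i, k)), a)
        edges.append((w, i, j))
        i = j
    return edges


# Hypothetical example graph on 3 nodes: 0 -> 1, 0 -> 2, 1 -> 2.
example_edges = {(0, 1), (0, 2), (1, 2)}
example_path = pseudo_path(lambda i, j: (i, j) in example_edges, 3, 3)
```

In this example node 2 has no outgoing edge, so the path continues with the pseudo-edge to $a = 3$, which is exactly the situation the term "pseudo-path" is meant to cover.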

Notation 1. For simplicity, $\psi_\phi$ is the part of the $\Sigma_0^B$-edge-rec axiom
instantiated with $\phi$. Note this includes the bound on the size of $Z$. So
the axiom can be written as $\exists Z \psi_\phi$.

Definition 4.1. VL' is the theory axiomatized by the axioms of $V^0$,
the $\Sigma_0^B$-edge-rec axioms, and Axiom (4.1). The language of VL' is the
language of $V^0$ plus a string constant $C$ with defining axiom

$|C| = 0$. (4.1)

We add the string constant to the language so we can put VL' proofs
in free variable normal form (below). We do not use the constant for any
other reason. Also, in the translation, we can treat $C$ as a string variable
with $n = 0$.

Lemma 4.2. The theory VL is equivalent to VL' . 

Proof. To prove the two theories are equivalent, we must show that VL
proves the $\Sigma_0^B$-edge-rec axiom and that VL' proves the $\Sigma_0^B$-rec axiom.
Since the two axioms express similar ideas, this is not surprising.

To show that VL proves the $\Sigma_0^B$-edge-rec axiom, let $\phi(i,j)$ be any
$\Sigma_0^B$ formula. Then let $Y$ be the string such that
$Y(i,j) \leftrightarrow (j < a \supset \phi(i,j)) \land \forall k < j\, \neg\phi(i,k)$.
This $Y$ exists by $\Sigma_0^B$-COMP. We can think of $Y$
as the graph that contains only the edges the $\Sigma_0^B$-edge-rec axiom would
use. Since VL proves the $X$-MIN formula, it follows that VL proves
$\forall i \le a\, \exists j \le a\, Y(i,j)$. This means there exists a path of length $b$ in $Y$
that starts at node 0, by Lemma 2.3. It is a simple task to verify that the $b$ edges
in this path satisfy the $\Sigma_0^B$-edge-rec axiom for $\phi$.

To show that VL' proves the $\Sigma_0^B$-rec axiom, let $\phi(i,j)$ be a $\Sigma_0^B$ formula
such that $\forall i < a\, \exists j < a\, \phi(i,j)$. By the $\Sigma_0^B$-edge-rec axiom, there is a
pseudo-path of length $b$ in the graph $\phi$. We need to show that this is a
real path. Suppose $(i,j)$ is an edge in the path. If $j < a$, then $(i,j)$ is
in the graph by $\psi_1$ and $\psi_5$. Otherwise, $j = a$, and $\forall k < a\, \neg\phi(i,k)$. This
implies $\phi(i,j)$, since every node has out-degree at least 1 and so this case
cannot actually occur. This means
every edge in the pseudo-path exists, and there exists a path of length
$b$. □

The next step is to be sure the translation of the $\Sigma_0^B$-edge-rec axiom is
a $\Sigma CNF(2)$ formula. This is done by a careful inspection of the formula.

Lemma 4.3. The formula $\|\exists Z \psi_\phi(a,b,Z)\|$ is a $\Sigma CNF(2)$ formula.

Proof. First we assume $\phi(i,j) \equiv X(i,j)$ for some variable $X$. It is easy to
see that $\|\psi_{X(i,j)}(a,b,Z)\|[a,b;t,a*a]$, where $t$ is the bound on $Z$ given in
the $\Sigma_0^B$-edge-rec axiom, is a CNF formula. Note that we assigned $|Z| = t$
and $|X| = a * a$. We now need to make sure the clauses have the correct
form. This is done by examining each occurrence of a bound literal, which
requires a careful inspection of the definition of
the axiom. The only bound variables are those that come from $Z$. These
are the variables coding the bits $Z(w,i,j)$, which we will refer to as $z_{w,i,j}$.
The only free variables are
those corresponding to $X$. These variables will be referred to as $x_{i,j}$.
We will first look at the positive occurrences of $z_{w,i,j}$. On inspection,
we can observe that, when $w < b$, every occurrence of $z_{w,i,j}$ must be in
clauses that are part of the translation of $\psi_4$. We want to show that every
clause that is part of the translation of $\psi_4$ has conflicting free variables.
This is true since $\neg X(i,j_1)$ will conflict with one of the variables from
$\exists l < j_2\, X(i,l)$ when $j_1 < j_2$. When $w = b$, the variable $z_{b,i,j}$ appears
once in $\psi_7$. Now we turn to the negative occurrences. When $w = 0$, the
variable $z_{0,i,j}$ will appear negatively in the clauses corresponding to $\psi_1$,
$\psi_2$, and $\psi_3$. If $i > 0$, it will appear only in the clauses corresponding to $\psi_3$
and will appear only once. If $i = 0$, the variable $z_{0,0,j}$ will not appear in
the translation of $\psi_3$ because the $i = 0$ part will satisfy the clause. It is
easy to observe that every occurrence of the variable in the translation of
$\psi_1$ and $\psi_2$ will have a conflicting free variable. Examine the construction
$X(0,j) \lor \exists l < j\, X(0,l)$ at the end of $\psi_1$ and $\neg X(0,k) \lor \exists l < k\, X(0,l)$ at
the end of $\psi_2$. A similar argument can be made with $\psi_4$, $\psi_5$, and $\psi_6$ when
$w > 0$. This implies that the translation is a $\Sigma CNF(2)$ formula when
$\phi(i,j) \equiv X(i,j)$. When $\phi$ is a more general formula, the translation is
the formula in the first case with the free variables substituted with the
translation of $\phi$, which will be $\Sigma_0^q$. Since $\Sigma CNF(2)$ formulas are closed
under this type of substitution, the formula is $\Sigma CNF(2)$ in all cases. □

4.2 Normal Form For VL' 

In this section, we want to find a normal form for VL' proofs which ensures
that the translations of VL' proofs satisfy the variable restriction for GL*.
The normal form we want is cut variable normal form (CVNF), which is
defined below.

Definition 4.4. A formula $\phi(Y)$ is bit-dependent on $Y$ if there is an
atomic sub-formula of $\phi$ of the form $Y(t)$, for some term $t$.

Definition 4.5. A proof is in free variable normal form if (1) every non- 
parameter free variable y or Y that appears in the proof is used as an 
eigenvariable exactly once and (2) parameter variables are never used as 
eigenvariables. 

Note that if a proof is in free variable normal form we can assume that 
every instance of the non-parameter variable Y (or y) is in an ancestor 
of the sequent where Y is used as an eigenvariable. If it is not, we can 
replace Y with the constant C in all those sequents. 

Definition 4.6. A cut in a proof is anchored if the cut formula is an 
instance of an axiom. 

Definition 4.7. A VL' proof $\pi$ is in cut variable normal form if (1) $\pi$ is
in free variable normal form, (2) every cut with a non-$\Sigma_0^B$ cut formula is
anchored, and (3) no cut formula that is an instance of the $\Sigma_0^B$-edge-rec
axiom is bit-dependent on a non-parameter free string variable.

It is known how to find a proof with the first two properties [6, 2], and 
this part will not be repeated here. Instead we focus on how to find a 
proof satisfying the third property. 

Theorem 4.8. For every $\Sigma_1^B$ theorem of VL' there exists a VL'-proof of
that formula in CVNF.

The proof of this theorem is the most technical in this paper. At a
high level, it amounts to showing $\Sigma_0^B$-edge-rec is closed under substitution
of strings defined by $\Sigma_0^B$-edge-rec and $\Sigma_0^B$-COMP. We begin with an
anchored proof that is in free variable normal form. We want to change every
cut that violates condition (3) in the definition of CVNF. Consider the
proof given in Figure 1. This is a simple example of what can go wrong.
The general case is handled in the same way, so we will only consider this
case.

$\gamma(Y), \Gamma \rightarrow \Delta$
$\exists Y \gamma(Y), \Gamma \rightarrow \Delta$
Figure 1: Example of a proof that is not in CVNF

Since all $\Sigma_1^B$ cut formulas are anchored and the $\exists Y \gamma(Y)$ must
eventually be cut, it is an instance of $\Sigma_0^B$-COMP or $\Sigma_0^B$-edge-rec. So you
can think of $\gamma$ as a formula that completely defines $Y$. Then we want to
change $\phi(Y)$ so that it does not mention $Y$ explicitly, but instead uses the
definition of $Y$ given by $\gamma$. Note that, for this to be true, the final formula
must be $\Sigma_1^B$; otherwise, $Y$ could have been used as an eigenvariable in a
$\forall$-right inference and would not be well defined.

Lemma 4.9. For any $\Sigma_0^B$ formula $\phi(Y)$, there exist $\Sigma_0^B$ formulas $\phi_1$ and
$\phi_2$ such that $\phi_1$ is not bit-dependent on $Y$ and $V^0$ proves the sequent

$\gamma(Y), \psi_{\phi_1}(Z), \forall i < t[Z'(i) \leftrightarrow \phi_2(i)] \rightarrow \psi_{\phi(Y)}(Z')$.

Proof. This proof is divided into two cases. In the first case, we assume

$\gamma(Y) \equiv |Y| \le t \land \forall i < t[Y(i) \leftrightarrow \phi'(i)]$. (4.2)

That is, $\exists Y \gamma(Y)$ is an instance of $\Sigma_0^B$-COMP. We know $Y$ must appear
in that position because it eventually gets quantified. In this case, $\phi_1$ is
$\phi$ with every atomic formula of the form $Y(s)$ replaced by $s < t \land \phi'(s)$,
and $\phi_2$ is the formula $Z(i)$. We can prove that there exists a proof of
(4.2) by structural induction on $\phi$.

For the second case, we assume $\gamma(Y) \equiv \psi_{\phi'}(Y)$. That is, $Y$ is the
pseudo-path in the graph of $\phi'$. The first step is to define branching
programs that compute $Y$ and $Z'$ (the pseudo-path in the graph of $\phi$) using
$Y$. Then $\phi_1$ is the description of the composition of these branching
programs, and $\phi_2$ is the $\Sigma_0^B$ formula that extracts $Z'$ from the run of this
last branching program.

Definition 4.10. A branching program is a nonempty set of nodes labeled
with triples $(\alpha, i, j)$, where $\alpha$ is a $\Sigma_0^B$ formula over some set of variables
and $0 \le i, j < t$ for some term $t$ that depends only on the inputs to the
program. Semantically, if a node $u$ is labeled with $(\alpha, i, j)$, then, when
the branching program is at node $u$, it will go to node $i$, if $\alpha$ is true, or
node $j$, otherwise. The initial node is always 0.

Note that a branching program is essentially a graph with a special
form, and, as with graphs, we use families of branching programs that
can be described by $\Sigma_0^B$ formulas. However, we will not give the explicit
construction of the formula; we leave it to the reader.

The first step is to introduce the initial branching program $BP_0$ that
computes $Z'$. The nodes of $BP_0$ are interpreted as triples $\langle w,i,j\rangle$. An
invariant for this branching program is that, if we reach the node $\langle w,i,j\rangle$,
then the $w$th node of $Z'$ is $i$ and $\forall k < j\, \neg\phi(i,k)$. At each node, we check
if $j$ is the next node. Let $a$ be the maximum value of a node and $b$ be
the length of the path. This means the number of nodes in $BP_0$ is bounded
by $\langle b,a,a\rangle$. So now to define the labels. If $j < a$, then $\langle w,i,j\rangle$ is labeled
with $(\phi(i,j), \langle w+1,j,0\rangle, \langle w,i,j+1\rangle)$. If $j = a$, then $\langle w,i,j\rangle$ is labeled
with $(\top, \langle w+1,j,0\rangle, 0)$. It is easy to see that the invariants hold and that
$Z'$ can be obtained from a path in $BP_0$ using $\Sigma_0^B$-COMP.

The branching program that computes $Y$ is constructed the same way
except $\phi'$ is used instead of $\phi$. Let this branching program be $BP$.
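The labelling of $BP_0$ can be made concrete with a small simulation. The Python sketch below is only an illustration under stated assumptions: the names `bp0_label` and `run_bp0` are hypothetical, labels store the truth value of the test rather than a formula, and the unused "false" successor at $j = a$ is taken to be the initial node $(0,0,0)$.

```python
from typing import Callable, List, Tuple

Node = Tuple[int, int, int]


def bp0_label(phi: Callable[[int, int], bool], a: int, node: Node):
    """Label (value of the test, successor if true, successor if false) of a BP_0 node."""
    w, i, j = node
    if j < a:
        return (phi(i, j), (w + 1, j, 0), (w, i, j + 1))
    # j == a: the test is the constant "true"; the false successor is never used
    return (True, (w + 1, j, 0), (0, 0, 0))


def run_bp0(phi: Callable[[int, int], bool], a: int, b: int) -> List[Node]:
    """Run BP_0 from the initial node and collect the edges (w, i, j) of the pseudo-path."""
    node: Node = (0, 0, 0)
    edges: List[Node] = []
    while node[0] <= b:
        test, if_true, if_false = bp0_label(phi, a, node)
        if test:
            edges.append(node)  # taking the true branch means the w-th edge is (i, j)
        node = if_true if test else if_false
    return edges


# Hypothetical example graph on 3 nodes: 0 -> 1, 0 -> 2, 1 -> 2.
bp_example_edges = {(0, 1), (0, 2), (1, 2)}
bp_example_run = run_bp0(lambda i, j: (i, j) in bp_example_edges, 3, 3)
```

Each step either increases $j$ or increases $w$, so the run visits at most $(b+1)(a+1)$ nodes, and the only state carried along is the current node.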

Moving on to the second step, we now want to simplify $BP_0$ so that
every node whose label is bit-dependent on $Y$ is labeled with an atomic
formula. This is done to simplify the construction of the composition.
We start with $BP_0$. Then, given $BP_i$, we define $BP_{i+1}$ by removing one
connective in a node of $BP_i$ that is not in the right form. Let node $n$
in $BP_i$ be labeled with $(\alpha, u_1, u_2)$. The construction is divided into five
cases: one for each possible outer connective.

Case $\alpha \equiv \neg\beta$: $BP_{i+1}$ is the same as $BP_i$ except node $n$ is now labeled
with $(\beta, u_2, u_1)$.

Case $\alpha \equiv \beta_1 \land \beta_2$: The nodes of $BP_{i+1}$ are interpreted as pairs $\langle u, v\rangle$.
The node $\langle u, 0\rangle$ corresponds to node $u$ in $BP_i$. The label of $\langle n, 0\rangle$ becomes
$(\beta_1, \langle n, 1\rangle, \langle u_2, 0\rangle)$ and the label for $\langle n, 1\rangle$ is
$(\beta_2, \langle u_1, 0\rangle, \langle u_2, 0\rangle)$. Notice
that $\langle n, 1\rangle$ is used as an intermediate node while evaluating $\alpha$.

Case $\alpha \equiv \beta_1 \lor \beta_2$: $BP_{i+1}$ is defined as in the previous case, with a few
minor modifications. This case is left to the reader.

Case $\alpha \equiv \exists z \le t\, \beta(z)$: The nodes become pairs as in the previous
case, but this time the labels are different. The node $\langle n, i\rangle$ is labeled with
$(\beta(i), \langle u_1, 0\rangle, \langle n, i+1\rangle)$, when $i < t$. If $i = t$, the node is labeled with
$(\beta(t), \langle u_1, 0\rangle, \langle u_2, 0\rangle)$. In this case, the branching program is looking for
an $i$ that satisfies $\beta(i)$.

Case $\alpha \equiv \forall z \le t\, \beta(z)$: This case is similar to the previous case. The
only difference is the branching program is looking for an $i$ that falsifies
$\beta(i)$.

Let $BP_n$ be the final branching program in the construction above.
We now construct a branching program $BP'$ that is the composition of
$BP_n$ and $BP$. The nodes of $BP'$ are pairs $\langle u_1, u_2\rangle$ where the first element
corresponds to a node in $BP_n$ and the second element corresponds to a
node in $BP$.

Suppose node $u_1$ in $BP_n$ is labeled with $(\alpha, v_1, v_2)$. If $\alpha$ is not
bit-dependent on $Y$, then the node $\langle u_1, 0\rangle$ is labeled with
$(\alpha, \langle v_1, 0\rangle, \langle v_2, 0\rangle)$.

It is also possible that $\alpha$ is bit-dependent on $Y$; in which case, $\alpha$ is of
the form $Y(w,i,j)$. Let $(\beta, w_1, w_2)$ be the label for node $u_2$ in $BP$. Then
the node $\langle u_1, u_2\rangle$ is labeled as follows:

$(\beta, \langle u_1, w_1\rangle, \langle u_1, w_2\rangle)$, if $u_2 < \langle w,a,a\rangle$ and $u_2 \ne \langle w,i,j\rangle$,
$(\beta, \langle v_1, 0\rangle, \langle v_2, 0\rangle)$ if $u_2 = \langle w,i,j\rangle$, and
$(\top, \langle v_2, 0\rangle, \langle v_2, 0\rangle)$ otherwise.

In this case, we are using the second element to run $BP$ and determine
if the $w$th edge in the path is $(i,j)$. If it is, we move on to $\langle v_1, 0\rangle$, and, if it
is not, we move on to $\langle v_2, 0\rangle$. In the labels above, the first line corresponds
to running $BP$. The second line corresponds to a check if $(i,j)$ is the $w$th
edge. The third line is used when we have already found the $w$th edge
and it is not $(i,j)$.

It is not difficult to see that it is possible to construct $\phi_1$ (a formula
describing $BP'$), and $\phi_2$ (a formula extracting $Z'$ from a run of $BP'$).
Moreover, $V^0$ proves that this construction works. □

$\psi_{\phi(Y)}(Z'), \gamma(Y), \Gamma \rightarrow \Delta$        $\gamma(Y), \psi_{\phi_1}(Z), \tau(Z') \rightarrow \psi_{\phi(Y)}(Z')$
$\psi_{\phi_1}(Z), \tau(Z'), \gamma(Y), \Gamma \rightarrow \Delta$
$\psi_{\phi_1}(Z), \exists Z' \tau(Z'), \gamma(Y), \Gamma \rightarrow \Delta$        $\rightarrow \exists Z' \tau(Z')$
$\psi_{\phi_1}(Z), \gamma(Y), \Gamma \rightarrow \Delta$
$\exists Z \psi_{\phi_1}(Z), \gamma(Y), \Gamma \rightarrow \Delta$        $\rightarrow \exists Z \psi_{\phi_1}(Z)$
$\gamma(Y), \Gamma \rightarrow \Delta$
$\exists Y \gamma(Y), \Gamma \rightarrow \Delta$
Figure 2: Modification of the proof in Figure 1. The formula $\tau(Z')$ is used
in place of $\forall i < t[Z'(i) \leftrightarrow \phi_2(i)]$.

Using this lemma, we are able to change the proof in Figure 1 into
the proof in Figure 2. In that proof, $P'$ is the proof $P$ with the rules
that introduced $\exists Z$ ignored (renaming variables if necessary), and $Q$ is an
anchored $V^0$ proof, which we know exists by the lemma above. This gives
us a new proof of the same formula that still satisfies properties (1) and
(2) in Definition 4.7 and contains one fewer cut that is bit-dependent on
$Y$.

Using this manipulation, we prove Theorem 4.8. 

Proof of Theorem 4.8. It would be nice to be able to simply say we can
repeatedly apply the manipulations above and eventually the proof will
be in CVNF, but this is not obvious. In the manipulation, if $\gamma(Y)$ is
bit-dependent on a string variable other than $Y$, then the new $\Sigma_0^B$-edge-rec
cut formula is bit-dependent on that variable. This includes
non-parameter string variables. So we need to state our induction hypothesis
more carefully.

Let $Y_1, \ldots, Y_n$ be all of the non-parameter free string variables that
appear in $\pi$, ordered such that the variable $Y_i$ is used as an eigenvariable
before $Y_j$ for $i < j$. This implies $Y_i$ does not appear in $\gamma(Y_j)$ in the
manipulations above. So now suppose no $\Sigma_0^B$-edge-rec cut formula is
bit-dependent on the variables $Y_1, \ldots, Y_k$, for some $k < n$. Then we can
manipulate $\pi$ such that the same holds for the variables $Y_1, \ldots, Y_{k+1}$.
To accomplish this, we simply manipulate every $\Sigma_0^B$-edge-rec cut formula
that is bit-dependent on $Y_{k+1}$ as described above. Since $Y_1, \ldots, Y_k$ cannot
appear in $\gamma(Y_{k+1})$, those variables will not violate the condition. So by
induction, we can get a proof that is in CVNF. □

5 Translation Theorem 

We are now prepared to prove the translation theorem. The proof is done
by induction on the length of the proof. For the base case, we need to prove
the translation of the axioms of VL'. We know the $\Sigma_0^B$-COMP and the
2BASIC axioms have polynomial-size $G_0^*$ proofs from other translation
theorems [5]. This means they also have polynomial-size GL* proofs.
Axiom (4.1) is easy to prove since it translates to $\rightarrow \top$. We still need to
show how to prove the $\Sigma_0^B$-edge-rec axiom in GL*. Recall that we write
the axiom as $\exists Z \psi_\phi(a,b,Z)$. Note that the axiom does have a bound on
$Z$, but it has been omitted since the specific bound is not important.

Lemma 5.1. The formula $\|\exists Z \psi_\phi(a,b,Z)\|$ has a GL* proof of size $p(a,b)$
for some polynomial $p$.

Proof. The proof is done by a brute force induction. We prove, in GL*,
that, if there exists a pseudo-path of length $b$, then there exists a
pseudo-path of length $b+1$. It is easy to prove there exists a pseudo-path of length
0. Then with repeated cutting we get our final result. The entire path is
quantified, so we do not cut formulas with non-parameter free variables.

Given variables that encode a path of length $b$, we can define $\Sigma_0^q$
formulas that determine the next edge. Let $A_{i,j} \equiv \|\phi(i,j)\|$. Since $\phi$ is a
$\Sigma_0^B$ formula, $A_{i,j}$ is a $\Sigma_0^q$ formula. To prove that there is an edge that starts
the path, consider the formula

$B_{0,0,j} \equiv A_{0,j} \land \bigwedge_{k=0}^{j-1} \neg A_{0,k}$

when $j < a$, and

$B_{0,0,a} \equiv \bigwedge_{k=0}^{a-1} \neg A_{0,k}$.

It is easy to see $B_{0,0,j}$ is true for exactly one $j \le a$. This is also provable
in GL*. This shows that GL* has a polynomial-size proof of

$\|\exists Z \psi_\phi(a,1,Z)\|$.

For the inductive step, if there is a path of length $b$ and the path
is given by the variables $z_{w,i,j}$, then the witnesses for the next edge are
defined as follows:

$B_{b+1,i,j} \equiv \left(\bigvee_{k=0}^{a} z_{b,k,i}\right) \land A_{i,j} \land \bigwedge_{k=0}^{j-1} \neg A_{i,k}$

when $j < a$, and

$B_{b+1,i,a} \equiv \left(\bigvee_{k=0}^{a} z_{b,k,i}\right) \land \bigwedge_{k=0}^{a-1} \neg A_{i,k}$.

Using the fact that exactly one $z_{b,i,j}$ is true, we can prove in GL* that
exactly one $B_{b+1,i,j}$ is true. This shows that GL* has a polynomial-size
proof of

$\|\exists Z \psi_\phi(a,b,Z)\| \rightarrow \|\exists Z \psi_\phi(a,b+1,Z)\|$.

So now we are able to prove $\|\exists Z \psi_\phi(a,b,Z)\|$ for any $b$ by successive
cutting. Recall that $\|\exists Z \psi_\phi(a,b,Z)\|$ is a $\Sigma CNF(2)$ formula, and note
that the free variables in $\|\exists Z \psi_\phi(a,b,Z)\|$ do not change as $b$ changes.
This means we are allowed to do the cut. □
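The inductive step can be checked on concrete truth values. The following Python sketch evaluates the witnesses $B_{b+1,i,j}$ from given values of $A_{i,j}$ and $z_{b,k,i}$; it is an illustration only, and the function name `next_edge_witnesses` and the table layout (rows indexed by $0..a$) are assumptions made here, not notation from the proof.

```python
from typing import List


def next_edge_witnesses(A: List[List[bool]], z_b: List[List[bool]], a: int) -> List[List[bool]]:
    """Evaluate B[i][j] for the (b+1)-th edge.

    A[i][j] is the truth value of ||phi(i, j)|| for i <= a and j < a.
    z_b[k][i] is the truth value of z_{b,k,i} for k, i <= a.
    """
    B = [[False] * (a + 1) for _ in range(a + 1)]
    for i in range(a + 1):
        # big disjunction: the b-th edge ends in node i
        reached = any(z_b[k][i] for k in range(a + 1))
        for j in range(a + 1):
            # big conjunction: no edge from i to any k < j
            none_smaller = all(not A[i][k] for k in range(j))
            has_edge = A[i][j] if j < a else True  # the case j = a has no A-conjunct
            B[i][j] = reached and has_edge and none_smaller
    return B


# Hypothetical example: 3 nodes with edges 0 -> 1, 0 -> 2, 1 -> 2; the b-th edge is (0, 1).
a_example = 3
edge_set = {(0, 1), (0, 2), (1, 2)}
A_example = [[(i, j) in edge_set for j in range(a_example)] for i in range(a_example + 1)]
z_example = [[(k, i) == (0, 1) for i in range(a_example + 1)] for k in range(a_example + 1)]
B_example = next_edge_witnesses(A_example, z_example, a_example)
```

When exactly one $z_{b,k,i}$ is true, exactly one entry of the resulting table is true, which is the property the proof asks GL* to establish.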

This can be used to prove the translation theorem. 

Theorem 5.2 (VL-GL* Translation Theorem). Suppose VL proves
$\exists Z \le t\, \phi(x, X, Z)$, where $\phi$ is a $\Sigma_0^B$ formula. Then there are polynomial-size GL*
proofs of $\|\exists Z \le t\, \phi(x,X,Z)\|[n]$.

Proof. By Lemma 4.2 and Theorem 4.8, there exists a VL' proof $\pi$ of
$\exists Z \le t\, \phi(x, X, Z)$ that is in CVNF.

We proceed by induction on the depth of $\pi$. The base case follows
from Lemma 5.1 and the comments that precede it. The inductive step
is divided into cases: one for each rule. With the exception of cut, every
rule can be handled the same way it is handled in the $V^1$-$G_1^*$ Translation
Theorem (Theorem 7.51, [6]), and will not be repeated here.

When looking at the cut rule, there are three cases. If the cut formula
is $\Sigma_0^B$, then we simply cut the corresponding $\Sigma_0^q$ formula in the GL* proof.
If the cut formula is not $\Sigma_0^B$, then it must be anchored since the proof is
in CVNF. This means the cut formula is an instance of $\Sigma_0^B$-edge-rec or
an instance of $\Sigma_0^B$-COMP. First suppose it is an instance of $\Sigma_0^B$-edge-rec.
Then we are able to cut the corresponding formula in the GL* proof.
This is because the axiom translates into a $\Sigma CNF(2)$ formula, and the
free variables in the translation are parameter variables since the formula
is not bit-dependent on non-parameter string variables.

When the cut formula is an instance of $\Sigma_0^B$-COMP, we apply the same
transformation as in the proof of the $VNC^1$-$G_0^*$ translation theorem [5].
That is, we remove the quantifiers by replacing the variables with $\Sigma_0^q$
formulas that witness the quantifiers. This change does not affect other cuts
since their free variables are parameter variables or they are $\Sigma_0^q$ formulas
and remain $\Sigma_0^q$ after the substitution. The current cut formula becomes a
$\Sigma_0^q$ formula, which can be cut. Note that, since there are a constant number
of cuts of this axiom, the substitution does not cause an exponential
increase in the size of the formulas. □

6 Proving Reflection Principles 

In this section, we show that GL* does not capture reasoning for a higher
complexity class. This is done by proving, in VL, that GL* is sound.
This idea comes from [3], where Cook showed that PV proves extended-Frege
is sound, and [9], where Krajíček and Pudlák showed $T_2^i$ proves $G_i$
is sound for $i > 0$.

We will actually show that VL proves GL* is sound. The idea behind
the proof is to give an $\mathcal{L}_{FL}$ function that witnesses the quantifiers in the
proof. Then we prove, by $\Sigma_0^B(\mathcal{L}_{FL})$-IND, that this function witnesses every
sequent, including the final sequent. Therefore the formula is true.

We start by giving an algorithm that witnesses $\Sigma CNF(2)$ formulas in
L when the formula is true. This algorithm is the algorithm given in [7]
with a few additions to find the satisfying assignment. We describe an
$\mathcal{L}_{FL}$ function that corresponds to this algorithm and prove it correct in
VL. We then use this function to find an $\mathcal{L}_{FL}$ function that witnesses
GL* proofs, and prove it correct in VL.

6.1 Witnessing $\Sigma CNF(2)$ Formulas

Let $\exists z A(x, z)$ be a $\Sigma CNF(2)$ formula. We will describe how to find a
witness for this formula. We assume that $A$ is a CNF formula. That is,
the substitution of the $\Sigma_0^q$ formulas has not happened. The general case
is essentially the same.

The first thing to take care of is the encoding of $A$. We will not go
through this in detail. Suffice it to say that parsing a formula can be done
in $TC^0$ [5], and, as long as we are working in a theory that extends $TC^0$
reasoning, we can use any reasonable encoding. We will refer to the $i$th
clause of $A$ as $C_i^A$. A clause will be viewed as a set of literals. A literal is
either a variable or its negation. So we will write $l \in C_i^A$ to mean that the
literal $l$ is in the $i$th clause of $A$. Since the parsing can be done in $TC^0$,
these formulas can be defined by $\Sigma_0^B(\mathcal{L}_{FL})$ formulas. An assignment will
also be viewed as a set of literals. If a literal is in the set, then that literal
is true. So an assignment $X$ satisfies a clause $C$ if and only if $X \cap C \ne \emptyset$.
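The set-based view of clauses and assignments is easy to mirror in code. In the Python sketch below, literals are coded as nonzero integers (`k` for a variable and `-k` for its negation); this coding and the function names are assumptions made for illustration, since the text allows any reasonable encoding.

```python
from typing import Iterable, Set


def satisfies_clause(assignment: Set[int], clause: Set[int]) -> bool:
    """An assignment satisfies a clause iff the two sets of literals intersect."""
    return not assignment.isdisjoint(clause)


def satisfies_formula(assignment: Set[int], clauses: Iterable[Set[int]]) -> bool:
    """A CNF formula is satisfied iff every clause is satisfied."""
    return all(satisfies_clause(assignment, c) for c in clauses)


# Hypothetical example: X makes x1 true and x2 false.
X_example = {1, -2}
```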

Given values for $x$, we first simplify $A$ to get a $CNF(2)$ formula. We
will refer to the simplified formula as $F$. This can be done using the $\mathcal{L}_{FL}$
function defined by the following formula:

$l \in C_i^F \leftrightarrow l \in C_i^A \land X \cap C_i^A = \emptyset$,

where $X$ is the assignment to the free variables. From the definition of a
$\Sigma CNF(2)$ formula, VL can easily prove that $F$ now encodes a $CNF(2)$
formula. In fact, it can be shown that no literal appears more than once.
A satisfying assignment to this formula is the witness we want. Mark
Braverman gave an algorithm for finding this assignment [1], but we use
a different algorithm that is easier to formalize.

Before we describe the algorithm that finds this assignment, we go
through a couple of definitions. First, a pure literal is a literal that appears
in the formula, but its negation does not. Next, the formula imposes an
order on the literals. We say a literal $l_1$ follows a literal $l_2$ if the clause
that contains $l_2$ also contains $l_1$, and $l_1$ is immediately to the right of $l_2$,
circling to the beginning if $l_2$ is the last literal. More formally:

$follows(l_1, l_2, F) \leftrightarrow \exists i,\ l_1 \in C_i^F \land l_2 \in C_i^F \land [(l_2 < l_1 \land \forall l_3 (l_2 < l_3 < l_1 \supset l_3 \notin C_i^F))$
$\lor\ (l_1 \le l_2 \land \forall l_3 ((l_3 < l_1 \lor l_2 < l_3) \supset l_3 \notin C_i^F))]$

Note that if a clause contains a single literal then that literal follows
itself. Also, note that literals are coded by numbers and $l_1 < l_2$ means
the number coding $l_1$ is less than the number coding $l_2$.

To find the assignment to $F$, we will go through the literals in the
formula in a very specific order. Starting with a literal $l$ that is not a pure
literal, the next literal is the literal that follows $\bar{l}$:

$next(l_1, F) = l_2 \leftrightarrow follows(l_2, \bar{l_1}, F)$.

Note that if $l_1$ is a pure literal, then there is no next literal, so we simply
define it to be itself. The important distinction is that next gives an
ordering of the literals in a formula, and follows orders the literals in a
clause. When $F$ is understood, we will not mention $F$ in next and follows.
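The two orderings can be sketched in Python as follows. This is an illustration under assumptions that are not fixed by the text: literals are nonzero integers (`-k` negates `k`), the hypothetical function `code` supplies the numeric coding used to order a clause, and every literal is assumed to occur in at most one clause, as in a $CNF(2)$ formula after simplification.

```python
from typing import List, Optional, Set


def code(l: int) -> int:
    """Assumed numeric coding of literals: x_k -> 2k, not-x_k -> 2k + 1."""
    return 2 * abs(l) + (1 if l < 0 else 0)


def clause_of(l: int, clauses: List[Set[int]]) -> Optional[Set[int]]:
    """The clause containing l, or None if l does not occur."""
    for c in clauses:
        if l in c:
            return c
    return None


def follows_of(l: int, clauses: List[Set[int]]) -> int:
    """The literal that follows l in its clause, circling to the beginning."""
    ordered = sorted(clause_of(l, clauses), key=code)
    return ordered[(ordered.index(l) + 1) % len(ordered)]


def is_pure(l: int, clauses: List[Set[int]]) -> bool:
    """l occurs in the formula but its complement does not."""
    return clause_of(l, clauses) is not None and clause_of(-l, clauses) is None


def next_literal(l: int, clauses: List[Set[int]]) -> int:
    """next(l): the literal following the complement of l; a pure literal maps to itself."""
    if is_pure(l, clauses):
        return l
    return follows_of(-l, clauses)


# Hypothetical formula (x1 or x2) and (not x1 or x3) and (not x2 or not x3).
F_example = [{1, 2}, {-1, 3}, {-2, -3}]
```

On `F_example`, repeatedly applying `next_literal` to `1` visits `3`, then `-2`, and then returns to `1`, which is the kind of cycle the traversal relies on.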

The algorithm that finds the assignment works in stages. At the beginning
of stage $i$, we have an assignment that satisfies the first $i - 1$ clauses.
Then, in the $i$th stage, we make local changes to this assignment to satisfy
the $i$th clause as well. At a high level, to satisfy the $i$th clause, we start
with the first literal in the $i$th clause, and assign that literal to true. The
clause that contains this literal's negation may have gone from being
satisfied to being unsatisfied. So we now go to the next literal, which is
in this other clause. We continue this until we get to a point where we
know the other clause is satisfied. We need to be able to do this in L.
Algorithm 1 shows how to do this. At any point in the algorithm, the only
information we need are the values of $l_1$ and $l_2$, so this is in L. Note that
we do not store the assignment on the work tape, but on a write-only
output tape. What is not obvious is why this algorithm works.

Algorithm 1 Algorithm for Stage i
Set $l_1$ to the first literal in clause $i$.
repeat
    Assign true to $l_1$.
    set $l_2 := next(l_1)$
    while $l_2$ is not the complement of $l_1$ do
        Assign true to $l_2$
        set $l_2 := next(l_2)$
        If $l_2$ is a pure literal, assign true to $l_2$, and stage $i$ is done.
        If $l_1$ and $l_2$ are in the same clause, stage $i$ is done.
    end while
    Assign true to $l_1$. {This statement is redundant, but it is included to
    emphasize that $l_1$ is true.}
    set $l_1 := next(l_1)$
until $l_1$ is the first literal in clause $i$
At this point we know the formula is unsatisfiable.
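Algorithm 1 can be transcribed almost line by line. The Python sketch below is an illustration, not the formalization used later: literals are nonzero integers (`-k` negates `k`), the ordering inside a clause uses an assumed coding `_code`, clauses are assumed nonempty with every literal occurring at most once in the formula, and a dictionary in which later writes override earlier ones stands in for the write-only output tape. All function names are hypothetical.

```python
from typing import Dict, List, Optional, Set


def _code(l: int) -> int:
    """Assumed numeric coding of literals: x_k -> 2k, not-x_k -> 2k + 1."""
    return 2 * abs(l) + (1 if l < 0 else 0)


def _clause_index(l: int, clauses: List[Set[int]]) -> Optional[int]:
    """Index of the clause containing l, or None if l does not occur."""
    for n, c in enumerate(clauses):
        if l in c:
            return n
    return None


def _next(l: int, clauses: List[Set[int]]) -> int:
    """next(l): the literal following the complement of l; l itself if l is pure."""
    n = _clause_index(-l, clauses)
    if n is None:
        return l
    ordered = sorted(clauses[n], key=_code)
    return ordered[(ordered.index(-l) + 1) % len(ordered)]


def run_stage(i: int, clauses: List[Set[int]], assignment: Dict[int, bool]) -> bool:
    """Run one stage for clause i; True means the stage is done, False means failure."""
    def assign_true(l: int) -> None:
        assignment[abs(l)] = l > 0  # the last write to a variable wins

    first = min(clauses[i], key=_code)
    l1 = first
    while True:  # repeat ... until l1 is the first literal again
        assign_true(l1)
        l2 = _next(l1, clauses)
        while l2 != -l1:
            assign_true(l2)
            l2 = _next(l2, clauses)
            if _clause_index(-l2, clauses) is None:  # l2 is a pure literal
                assign_true(l2)
                return True
            if _clause_index(l1, clauses) == _clause_index(l2, clauses):
                return True
        assign_true(l1)  # redundant, kept to mirror the pseudocode
        l1 = _next(l1, clauses)
        if l1 == first:
            return False


def find_assignment(clauses: List[Set[int]]) -> Optional[Dict[int, bool]]:
    """Run the stages in order; None means some stage failed."""
    assignment: Dict[int, bool] = {}
    for i in range(len(clauses)):
        if not run_stage(i, clauses, assignment):
            return None
    return assignment


# Hypothetical inputs: a satisfiable formula and the unsatisfiable pair (x1), (not x1).
sat_example = find_assignment([{1, 2}, {-1, 3}, {-2, -3}])
unsat_example = find_assignment([{1}, {-1}])
```

Apart from the output dictionary, the only state kept between steps is the pair `l1`, `l2` together with the fixed first literal, matching the observation that the procedure needs only logarithmic work space.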

The next lemma can be used to show that both loops will eventually
finish.

Lemma 6.1. For all literals $l$, there exists a $t > 0$ such that after $t$
applications of next to $l$, we get to $l$ or a pure literal.

Proof. Let $next^0(l) = l$ and $next^{t+1}(l) = next(next^t(l))$. Since next has a
finite range, there exist a minimum $i$ and a $t > 0$ such that $next^i(l) = next^{i+t}(l)$.
Suppose this is not a pure literal. If $i > 0$, then $next(next^{i-1}(l)) =
next(next^{i+t-1}(l))$. However, this implies $next^{i-1}(l) = next^{i+t-1}(l)$ since
next is one-to-one when not dealing with pure literals. This violates our
choice of $i$. Therefore $i = 0$, and $l = next^0(l) = next^t(l)$. □

This implies the inner loop will halt, because, if it does not end earlier,
$l_2$ will eventually equal $l_1$, at which point both will be in the same clause. For the
outer loop, if the algorithm does not halt for any other reason, $l_1$ will
eventually return to the first literal in the $i$th clause.

The next lemma plays a small role in the proof of correctness. 

Lemma 6.2. Suppose the algorithm fails at stage $i$ and that $next^t(l') = l$,
where $l'$ is the first literal in clause $i$. Then, for every literal in the same
clause as $l$, there is a $t'$ such that $next^{t'}(l')$ equals that literal.

Proof. To prove this lemma, we will show that there exists a $t'$ such that
$next^{t'}(l')$ equals the literal that follows $l$. Then by continually applying
this argument, you get that every literal in the clause is visited.

Let $l'$ be the first literal in the $i$th clause. Then, after going through
the outer loop $t$ times, $l_1 = l$. Since the algorithm fails, the inner loop will
finish because $l_2 = \bar{l_1}$. This means there is a $t'$ such that $next^{t'}(l') = \bar{l}$.
Then $next^{t'+1}(l')$ is the literal that follows $l$. □

Theorem 6.3. If the algorithm fails, the formula is unsatisfiable.

Proof. This is proved by contradiction. Let $F$ be a $CNF(2)$ formula and
$A$ be an assignment that satisfies it. Assume that the algorithm fails.
From this we can define a function from the set of variables to the set of
clauses as follows:

$f(i) = j \leftrightarrow (x_i \in C_j^F \land x_i \in A) \lor (\neg x_i \in C_j^F \land \neg x_i \in A)$.

Informally, if $f(i) = j$ then clause $C_j$ is true because of the variable $x_i$.
Since the formula is satisfied, this function is onto the set of clauses. Also,
since $F$ is $CNF(2)$, no literal appears more than once. So $f$ is indeed a
function because if $f(i) = j$ and $f(i) = j'$ then the literal $x_i$ or $\neg x_i$ is in
both $C_j^F$ and $C_{j'}^F$.

Now we will use the assumption that the algorithm fails to find a way
to restrict $f$ so that it violates the PHP. Suppose the algorithm fails at
stage $i$. Let $l$ be the first literal in clause $i$. We then define sets of variables
$V^a$ as follows:

$V^a = \{ x_n : \exists t < a,\ next^t(l) = x_n \lor next^t(l) = \neg x_n \}$.

We also define sets of clauses $W^a$ as follows:

$W^a = \{ C_n : \exists x \in V^a (x \in C_n \lor \neg x \in C_n) \}$.

Note that for a large enough $a$, say $|F|$, if $C_n$ is in $W^{|F|}$, then every variable
that appears in $C_n$ is in $V^{|F|}$ by Lemma 6.2. We show by induction on $a$ that
$|V^a| < |W^a|$.

For $a = 1$, $|V^1| = 1$. If $l$ is a pure literal or $l$ and $\neg l$ are in the same
clause, then the algorithm would succeed. Otherwise $|W^1| = 2$.

For the inductive case, suppose $|V^a| < |W^a|$. Let $l' = next^{a+1}(l)$. If $l'$
is not a new variable, then $|V^{a+1}| = |V^a| < |W^a| = |W^{a+1}|$. If $l'$ is a new
variable, then $\bar{l'}$ must be in a new clause. For, if this was not the case,
the algorithm would succeed. To see this, let $l_1$ be the most recent literal
in the same clause as $\bar{l'}$. We know $l_1$ is not $\bar{l'}$ since $l'$ is a new variable.
Then eventually $l_2$ will become $next(l')$, which is in the same clause as
$l_1$. The inner loop will not end because $l_2$ becomes the complement of $l_1$,
since that would mean $next(l_1)$ is more recent.

This gives $|V^{a+1}| = |V^a| + 1 < |W^a| + 1 = |W^{a+1}|$.

If we restrict $f$ to $V^{|F|}$, then $f$ is a function from $V^{|F|}$ that is onto
$W^{|F|}$, violating the PHP. □

Theorem 6.4. // the algorithm succeeds, then, for all i, the assignment 
after given at the end of stage i satisfies the first i clauses of F. 

Proof, The proof is done by induction on i. For i = 0, the statement holds 
since there are no clauses to satisfy. As an induction hypothesis, suppose 
the statement holds for i. Then we will show if the algorithm ever visits 
one of the literals in clause n, then that clause is satisfied. 

Consider clause $n$, where $n < i+1$. Find the last point in the algorithm 
at which either $l_1$ or $l_2$ was in clause $n$, and let $l$ be that literal. 
First, it is possible that when the algorithm ends $l_2$ is in clause $n$. If 
$l_2$ is a pure literal, then $l_2$ is set to true, satisfying the clause. 
Otherwise, $l_1$ and $l_2$ are in the same clause. In this case, $l_1$ is true 
since it was assigned true. If $l_2$ ever became $\overline{l_1}$, the 
algorithm would exit the inner loop, so $\overline{l_1}$ could never have been 
assigned true. 

Second, we consider the possibility that $l_2$ was not in clause $n$ when 
the algorithm ended. Then we claim that $l$ is true, and, therefore, clause 
$n$ is satisfied. Suppose for a contradiction that it is not. Then at some 
later point $\overline{l}$ was assigned true. This could happen in one of 
three places. The first is if $l_1 = \overline{l}$ and we are at the beginning 
of the outer loop. However, $l_2$ would be set to $next(\overline{l})$ right 
after, which is in clause $n$. This means we did not find the last occurrence 
of a literal in clause $n$ as we should have. A similar argument can be used 
in the other two places. □ 

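We now turn to formalizing this algorithm. For this, we define an 
$\mathcal{L}_{FL}$ function $f(i, t)$ that will return the value of $l_1$ and 
$l_2$ after $t$ steps in stage $i$. This is done using number recursion. In 
the following let $f(c, t) = (l_3, l_4)$: 

$$f(i, 0) = (l_1, l_2) \leftrightarrow l_1 = \min l \in C_i \land l_2 = next(l_1)$$

$$\begin{aligned}
f(c, t+1) = (l_1, l_2) \leftrightarrow\; & (\varphi_1 \supset l_1 = next(l_3) \land l_2 = next(l_1)) \\
\land\; & (\neg\varphi_1 \land \varphi_2 \supset (l_1 = l_3 \land l_2 = l_4)) \\
\land\; & (\neg\varphi_1 \land \neg\varphi_2 \supset (l_1 = l_3 \land l_2 = next(l_4)))
\end{aligned}$$

where 

$$\varphi_1 \equiv l_3 = \overline{l_4}$$

$$\varphi_2 \equiv (sameClause(l_3, l_4) \lor pureLiteral(l_4))$$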

The formulas $\varphi_1$ and $\varphi_2$ are the conditions that are used to 
recognize when the inner loop ends. The first formula is when the loop ends 
and we have to continue with the outer loop. The second formula is when the 
stage is finished. In the formula version, we do not stop if the algorithm 
fails. Instead we view the algorithm as failing if, after $|F|^2$ steps, 
$\varphi_2$ was never true. We use this value since $|F|$ is an upper bound on 
the number of literals in $F$ and the current state of the algorithm is 
determined by a pair of literals. In the following, any reference to time has 
the implicit bound of $|F|^2$. 
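
The three-case shape of the recursion for f can be pictured with a small sketch. This is only an illustration under stated assumptions: literals are modelled as plain integers, and `nxt`, `phi1` and `phi2` are hypothetical caller-supplied stand-ins for the next function and the two conditions, so the sketch does not fix how those are actually computed from F.

```python
from typing import Callable, Tuple

Literal = int                      # assumption: literals encoded as integers
State = Tuple[Literal, Literal]    # the pair (l1, l2) tracked by f


def step(state: State,
         nxt: Callable[[Literal], Literal],
         phi1: Callable[[Literal, Literal], bool],
         phi2: Callable[[Literal, Literal], bool]) -> State:
    """One step of the recursion: maps f(c, t) to f(c, t + 1)."""
    l3, l4 = state
    if phi1(l3, l4):
        # Inner loop ended; continue with the outer loop.
        l1 = nxt(l3)
        return (l1, nxt(l1))
    if phi2(l3, l4):
        # Stage finished; the state no longer changes.
        return (l3, l4)
    # Neither condition holds; advance the second literal only.
    return (l3, nxt(l4))


def run(start: State, steps: int,
        nxt: Callable[[Literal], Literal],
        phi1: Callable[[Literal, Literal], bool],
        phi2: Callable[[Literal, Literal], bool]) -> State:
    """Iterate step a fixed number of times, mirroring the time bound."""
    state = start
    for _ in range(steps):
        state = step(state, nxt, phi1, phi2)
    return state
```

Because the state is just a pair of literals, iterating `step` for a number of rounds equal to the square of the number of literals is enough to visit every reachable state, which is the reason for the implicit time bound.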

The final step is to extract the assignment. The assignment is done 
by finding the last time a variable is assigned a value. This means we 
must be able to determine when a variable is assigned a value. To do 
this, observe that a literal is assigned true just before the next function 
is applied to that literal. With this in mind we get the following: 

$$Assigned(i, t, l) \leftrightarrow \exists l'\,\bigl(f(i, t) = (next(l), l') \lor f(i, t) = (l', next(l))\bigr)$$

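So $Assigned(i, t, l)$ means that $l$ was assigned true during the $t$th step 
of stage $i$. Then we can get the assignment as follows: 

$$\begin{aligned}
l \in Assignment(i, F) \leftrightarrow\; & c = \max_{c} \exists t\, Assigned(c, t, l) \\
\land\; & t = \max_{t} Assigned(c, t, l) \\
\land\; & c' = \max_{c'} \exists t'\, Assigned(c', t', \overline{l}) \\
\land\; & t' = \max_{t'} Assigned(c', t', \overline{l}) \\
\land\; & (c > c' \lor (c = c' \land t > t'))
\end{aligned}$$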

The idea is the value of a variable is the last value that was assigned to 
it. 
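
The "last assignment wins" idea can be sketched as follows. The encoding is an assumption made for illustration only: each assignment event is a hypothetical triple (stage, step, literal), a literal is a nonzero integer, and a negative integer stands for the complement of the corresponding variable. The names `Event` and `extract_assignment` are not from the paper.

```python
from typing import Dict, Iterable, Tuple

# Assumption: an event (c, t, lit) records that literal lit was assigned
# true at step t of stage c; -lit denotes the complement of lit.
Event = Tuple[int, int, int]


def extract_assignment(events: Iterable[Event]) -> Dict[int, bool]:
    """Return, for each variable, the value given by its latest event.

    Events are compared lexicographically on (stage, step), matching the
    condition c > c' or (c = c' and t > t').
    """
    last: Dict[int, Tuple[Tuple[int, int], bool]] = {}
    for c, t, lit in events:
        var, value = abs(lit), lit > 0
        # Keep the event only if it is strictly later than the stored one.
        if var not in last or (c, t) > last[var][0]:
            last[var] = ((c, t), value)
    return {var: value for var, (_, value) in last.items()}
```

For example, if variable 1 is assigned true in stage 1 and its complement is assigned true in stage 2, the extracted value of variable 1 is false.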

The VL proof that this algorithm is correct is essentially the same 
as the proofs of Theorem 6.3 and Theorem 6.4, which can be formalized 
in VL. This gives the following. 

Theorem 6.5. VL proves that, if the algorithm fails, the formula is 
unsatisfiable. 

Theorem 6.6. VL proves that, if the algorithm succeeds, then, for all i, 
Assignment(i, F) gives a satisfying assignment to the first i clauses of F. 
6.2 Witnessing GL* Proofs 

Let $\pi$ be a GL* proof of a $\Sigma_1^q$ formula 
$\exists \vec{z}\, P(\vec{x}, \vec{z})$, and let $A$ be an assignment to the 
parameter variables. We assume $\pi$ is in free variable normal form 
(Definition 2.8). 

Let $\Gamma_i \to \Delta_i$ be the $i$th sequent in $\pi$. We will prove by 
induction that for any assignment to all of the free variables of $\Gamma_i$ 
and $\Delta_i$, a function $Wit(i, \pi, A)$ will find at least one formula 
that satisfies the sequent. 

There are two things to note. By the subformula property, every 
formula in $\Gamma_i$ is $\Sigma CNF(2)$, which means it can be evaluated. 
Also, we need an assignment that gives appropriate values to the 
non-parameter free variables that could appear. To take care of this second 
point, we extend $A$ to an assignment $A'$ as follows: 
1: Given a non-parameter free variable $y$, find the $\exists$-left inference 
   in $\pi$ that uses $y$ as an eigenvariable. Let $z$ be the new bound 
   variable and let $F$ be the principal formula. 
2: Find the descendant of $F$ that is used as a cut formula. Let $F'$ be 
   the cut formula. Note that $F$ is a subformula of $F'$, and, because of 
   the variable restriction on cut formulas, every free variable in $F'$ is 
   a parameter variable. 
3: Assign $y$ the value that $Assignment(F', A)$ assigns $z$. 
The reason for this particular assignment will become evident in the proof 
of Lemma 6.7. 

We can now define $Wit(i, \pi, A')$, which witnesses 
$\Gamma_i \to \Delta_i$. $Wit$ will go through each formula in the sequent to 
find a formula that satisfies the sequent. $\Sigma CNF(2)$ formulas are 
evaluated using the algorithm described in the previous section. We will now 
focus our attention on other $\Sigma_1^q$ formulas, which must appear in 
$\Delta_i$. Each formula $F = \exists \vec{z}\, F^*(\vec{z})$ in $\Delta_i$ is 
evaluated by finding a witness to the quantifiers as follows: 

1: Find a formula $F'$ in $\pi$ that is an ancestor of $F$, is satisfied by 
   $A'$, and is a $\Sigma_0^q$ formula of the form 
   $F^*(z_1/B_1, \ldots, z_n/B_n)$, where each $B_i$ is $\Sigma_0^q$. 
2: $z_i$ is assigned $\top$ if $A'$ satisfies $B_i$, otherwise it is assigned 
   $\bot$. 
3: If no such $F'$ exists, then every bound variable is assigned $\bot$. 
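
Steps 2 and 3 of this procedure can be sketched in a few lines. The sketch assumes that step 1 has already been carried out by the caller, and it models each substituted quantifier-free formula as a hypothetical Python predicate on the extended assignment; the names `witness`, `substituted` and `n_bound` are illustrative and do not come from the paper.

```python
from typing import Callable, Dict, List, Optional

Assignment = Dict[str, bool]
# Assumption: a quantifier-free formula B_i is modelled as a predicate
# that evaluates it under a given assignment.
Formula = Callable[[Assignment], bool]


def witness(substituted: Optional[List[Formula]],
            n_bound: int,
            a_prime: Assignment) -> List[bool]:
    """Values for the bound variables z_1, ..., z_n.

    substituted holds B_1, ..., B_n from a suitable ancestor, or None when
    no such ancestor exists (step 3).
    """
    if substituted is None:
        # Step 3: no suitable ancestor, so every bound variable is false.
        return [False] * n_bound
    # Step 2: z_i is true exactly when the assignment satisfies B_i.
    return [b(a_prime) for b in substituted]
```

For instance, with `B_1` reading variable `x` and `B_2` reading its negation, an assignment that sets `x` to true yields the witness values true and false for the two bound variables.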

Lemma 6.7. For every sequent $\Gamma_i \to \Delta_i$ in $\pi$, 
$Wit(i, \pi, A')$ finds a false formula in $\Gamma_i$ or a witness for a 
formula in $\Delta_i$. 

Proof. We prove the lemma by induction on the depth of the sequent. 
For the base case, the sequent is an axiom, and the lemma obviously 
holds. For the inductive step, we need to look at each rule. We can ignore 
$\forall$-left and $\forall$-right since universal quantifiers do not appear 
in $\pi$. 

We will now assume all formulas in $\Gamma_i$ are true and all 
$\Sigma CNF(2)$ formulas in $\Delta_i$ are false. So we need to find a 
$\Sigma_1^q$ formula in $\Delta_i$ that is true. 

Consider cut. Suppose the inference is 

$$\frac{F, \Gamma \to \Delta \qquad \Gamma \to \Delta, F}{\Gamma \to \Delta}$$

First suppose $F$ is true. By induction, with the upper left sequent, $Wit$ 
witnesses one of the formulas in $\Delta$. Then the corresponding formula in 
the bottom sequent is witnessed by $Wit$. This is because the ancestor of the 
formula in the upper sequent that gives the witness is also an ancestor of 
the corresponding formula in the lower sequent. If $F$ is false, it cannot be 
the formula that was witnessed in the upper right sequent, and a similar 
argument can be made. 

Consider $\exists$-right. Suppose the inference is 

$$\frac{\Gamma \to \Delta, F(B)}{\Gamma \to \Delta, \exists z F(z)}$$

First suppose $F(B)$ is $\Sigma_0^q$. If it is false, we can apply the 
inductive hypothesis, and, by an argument similar to the previous case, prove 
one of the formulas in $\Delta$ must be witnessed. If $F(B)$ is true, then 
$Wit$ will witness $\exists z F(z)$ since $F(B)$ is the ancestor that gives 
the witness. If $F(B)$ is not $\Sigma_0^q$, then we can apply the inductive 
hypothesis, and, by the same argument, find a formula that is witnessed. 

The last rule we will look at is $\exists$-left. Suppose the inference is 

$$\frac{F(y), \Gamma \to \Delta}{\exists z F(z), \Gamma \to \Delta}$$

To be able to apply the inductive hypothesis, we need to be sure that 
$F(y)$ is satisfied. If $\exists z F(z)$ is true, then we know $F(y)$ is 
satisfied by the construction of $A'$: the value assigned to $y$ is chosen to 
satisfy $F(y)$ if it is possible. Otherwise, $\exists z F(z)$ is false, and we 
do not need induction. 

For the other rules the inductive hypothesis can be applied directly 
and the witness found as in the previous cases. □ 

Theorem 6.8. VL proves GL* is sound for proofs of $\Sigma_1^q$ formulas. 

Proof. The functions $Assignment$ and $Wit$ are in FL and can be formalized 
in VL. A function that finds $A'$, given $A$, is also in FL and can be 
formalized in VL. The final thing to note is that the proof of Lemma 6.7 can 
be formalized in VL since the induction hypothesis can be expressed as a 
$\Sigma_0^B(\mathcal{L}_{FL})$ formula and the induction carried out. □ 

The reason this proof does not work for a larger proof system, say $G_1^*$, 
is that $Assignment$ cannot be formalized for the larger class of cut 
formulas. Also, if the variable restriction were not present, we would not 
be able to find $A'$ in L, and the proof would, once again, break down. 